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On the Interaction of Roundoff Noise 
and Dynamic Range in Digital Filters* 


By LELAND B. JACKSON 
(Manuscript received October 22, 1969) 


The interaction between the roundoff-noise output from a digital filter 
and the associated dynamic-range limitations 1s investigated for the case 
of uncorrelated rounding errors from sample to sample and from one error 
source to another. The required dynamic-range constraints are derived in 
terms of L, norms of the input-signal spectrum and the transfer responses 
to selected nodes within the filter. The concept of ‘transpose configurations” 
is introduced and is found to be quite useful in digital-filter synthesis; for 
although such configurations have identical transfer functions, their round- 
off-noise outputs and dynamic-range limitations can be quite different, 
in general. Two transpose configurations for the direct form of a digital 
filter are used to illustrate these results. 


I, INTRODUCTION 


With the rapid development of digital integrated circuits in the 1960’s 
and the potential for large-scale integration (LSI) of these circuits in 
the 1970’s, digital signal processing has become much more than a tool 
for the simulation of analog systems or a technique for the 1mplementa- 

* This paper is taken in part from a thesis submitted by Leland B. Jackson 


in partial fulfillment of the requirements for the degree of Doctor of Science in 
the Department of Electrical Engineering at Stevens Institute of Technology.t 
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tion of very complex and costly one-of-a-kind systems alone. The 
traditional advantages of digital systems, such as high accuracy, stable 
parameter values, and straight-forward realization, have been supple- 
mented through the use of integrated circuits by the additional advant- 
ages of high reliability, small circuit size, and ever-decreasing cost. 
As a result, it now appears that many signal processing systems which 
have been in the exclusive domain of analog circuits may in the future 
be implemented using digital circuits; while other proposed systems 
which could not be implemented at all because of the practical limita- 
tions of analog circuits may now be realized with digital circuits.” 

The key element in most of these new signal-processing systems is 
the digital filter. The term ‘‘digital filter’? here denotes a time-invariant, 
discrete or sampled-data filter with finite accuracy in the representation 
of all data and parameter values.° ° That is, all data and parameters 
within the filter are ‘‘quantized”’ to a finite set of allowable values with, 
in general, some form of error being incurred as a result of the quantiza- 
tion process. Implicit in this quantization is a maximum value or set 
of maximum values for the magnitudes of these data and parameters 
which, in the case of the data, is usually referred to as the “dynamic 
range”’ of the filter. 

Without the above quantization effects, linear discrete filters could 
be implemented exactly. Of course, one very significant feature of 
digital signal processing is that arbitrarily high accuracy can, in fact, 
be maintained once the initial analog-to-digital (A-D) conversion (if 
any) has taken place. However, there are still practical limitations to 
the accuracy of any physical system, and often it is desirable to mini- 
mize the accuracy of the implementation (while still satisfying the 
system specifications) in order to minimize the cost of the system. Hence, 
a thorough understanding of quantization errors in digital filters is 
quite important if the full potential of digital signal processing 1s ever 
to be realized. 


II. QUANTIZATION ERRORS IN DIGITAL FILTERS 


The specific sources of quantization error in the implementation and 
operation of a digital filter are as follows: 


(tz) The filter coefficients (multiplying constants) must be quantized 
to some finite number of digits (usually binary digits, or bits). 
(2) The input samples to the filter must also be quantized to a 
finite number of digits. 
(zat) The products of the multiplications (of data by coefficients) 
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within the filter must usually be rounded or truncated to a smaller 
number of digits. 

(zv) When floating-point arithmetic is used, rounding or truncation 
must usually be performed before or after additions as well. 


The first source of error above is deterministic and straightforward 
to analyze in that the filter characteristics must simply be recomputed 
to reflect the (small) changes in the filter coefficients due to quantizing.°”’ 
However, the inclusion of coefficient quantization in the initial filter 
synthesis procedure in order to minimize (in some sense) the resulting 
filter complexity produces a complex problem in nonlinear integer 
programming which has only begun to be investigated. 

The second source of error is often referred to as ‘‘quantization noise’. 
It is inherent in any A-D conversion process and has been studied in 
great depth.® Hence, input quantization has not been included in our 
investigation, except as it relates to other error sources of interest. 

The third and fourth error sources are similar to the second since they 
also involve quantization of the data, but they differ in two respects: 
(i) The data to be quantized is already digital in form, and (ii) the round- 
ing or truncation of the data takes place at various points within the 
filter, not just at its input. To distinguish these sources of error from the 
input quantization noise, the resulting error processes will be referred 
to as “roundoff noise’ (to be used generically, whether rounding or 
truncation is actually employed). Because of (ii), the roundoff noise is 
potentially much larger than the input quantization noise, and it is one 
of the principal factors which determine the complexity of the digital 
filter implementation, especially when special-purpose hardware is used. 

There are three variables in the filter implementation which deter- 
mine the level and character of the roundoff noise for a given input signal: 


(4) the number of digits (bits) used to represent the data within 
the filter, 
(it) the “mode” of arithmetic employed (that is, fixed-point or 
floating-point), and 
(272) the circuit configuration of the digital filter. The number of 
digits in the data may be thought of as determining either the quantiza- 
tion step size or the dynamic range of the filter. We choose here the 
latter interpretation in order to have the same step size for all filters. 
Therefore, with this interpretation, the number of data digits does not 
affect the level of the roundoff noise directly, but rather it limits the 
maximum allowable signal level and hence the realizable signal-to-noise 
ratio. Data within the filter must, of course, be properly “‘scaled”’ if the 


162 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


maximum signal-to-noise ratio is to be maintained without exceeding the 
dynamic-range limitations. Among the principal results reported here 
are the determination of appropriate scaling for certain important classes 
of input signals and the calculation of the effect of this scaling on the 
output roundoff noise. 

The output roundoff noise from a floating-point digital filter is usually 
(but not always) less than that from a fixed-point filter with the same © 
total number of data digits because of the automatic scaling provided 
by floating-point arithmetic.’’’” However, since floating-point arithmetic 
is significantly more complex and costly to implement, most special- 
purpose digital filters have been, and will probably continue to be, con- 
structed with fixed-point hardware. Hence, we have considered only 
fixed-point digital filters in this work although much of the analysis 
could be adapted to floating-point filters. Oppenheim has recently 
proposed another interesting mode of arithmetic for digital filter 1m- 
plementation, called “block-floating-point’’, which provides a simplified 
form of automatic scaling of the filter data.’ As would be expected, the 
performance of block-floating-point appears to lie somewhere between 
those of fixed-point and of floating-point. 

The third variable in the implementation of a digital filter, that of 
circuit configuration, is the principal factor determining the character 
(spectrum) of the output roundoff noise and, along with mode of the 
arithmetic, ultimately determines the number of data digits required to 
satisfy the performance specifications. In fact, the key step in the syn- 
thesis of a digital filter is the selection of an appropriate configuration 
for the digital circuit. There are a multitude of equivalent circuit con- 
figurations for any given linear discrete filter (whose transfer function 
is expressible as a rational fraction in z); but in the implementation of the 
corresponding digital filter, these configurations are no longer equivalent, 
in general, because of the effects of coefficient quantization and roundoff 
noise. As noted previously, the effects of coefficient quantization are 
deterministic and can thus be accounted for exactly as a (typically 
small) change in the transfer function of the discrete filter. Therefore, 
assuming that the coefficients for the configurations under consideration 
have been (or can be) quantized satisfactorily, the choice between these 
configurations is then determined by the level and character of their 
output roundoff noise. As we will show, there can be very significant 
differences between the roundoff-noise outputs of otherwise equivalent 
digital filter configurations. 

The content and complexity of any analysis of roundoff noise are 
determined to a large extent by the assumed correlation between round- 
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off errors. If these errors may be assumed to be uncorrelated from sample 
to sample and from multiplier (or other rounding point) to multiplier, 
then the roundoff-noise analysis is relatively straightforward, and the 
results are independent of the exact nature of the input signal to the 
filter. If, on the other hand, uncorrelated errors may not be assumed, 
then the analysis is much more complex, and the results are generally 
dependent on the particular input signal or class of input signals. This 
paper is concerned exclusively with the uncorrelated-error case because 
this assumption seems to be valid for most filters with input signals of 
reasonable amplitude and spectral content. Even in this case, the in- 
clusion of the associated dynamic-range constraints makes the analysis 
reasonably involved and the corresponding synthesis problem quite 
complex. 

Although the generic term ‘‘roundoff noise’ has been used to include 
the case of truncation as well as rounding, we actually concentrate on 
the rounding case. As long as the assumption of uncorrelated errors can 
be made, our results are applicable to either case, with the error variance 
for truncation being four times that for rounding. However, as the 
input signals become less ‘“‘random”’, the uncorrelated-error assumption 
tends to break down for truncation more readily than for rounding. 
Hence, additional care must be exercised in applying these results to the 
truncation case. 


Ill. FILTER MODEL FOR UNCORRELATED-ROUNDOFF-NOISE ANALYSIS 


The analyses appearing in the literature concerning roundoff noise 
in digital filters usually employ the simplifying and often reasonable 
assumption of uncorrelated roundoff errors from sample to sample and 
from one error source (multiplier or other rounding point) to 
another.” This assumption is based on the intuitively plausible 
and experimentally supported notion that for sufficiently large and 
dynamic signals within the filter, the small roundoff error made at one 
point in the network and/or in time should have little relationship to 
(that is, correlation with) the roundoff error made at any other point 
in the network and/or time. The advantage of assuming uncorrelated 
errors from one sample to another is that the noise injected into the 
filter by each rounding operation is then “white’’; while the advantage 
of assuming uncorrelated error sources is that the output noise power 
spectrum may then be computed as simply the superposition of the 
(filtered) noise spectra due to the separate error sources.” Experimental 
results which support the validity of this assumption, even in the case 
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of a single sinusoidal input, are presented in Ref. 1. In this section, we 
introduce the notation and develop the analysis pertaining to uncor- 
related roundoff noise for later use in investigating the synthesis of 
digital filters. 

Digital filter networks are composed of three basic elements: adders, 
constant multipliers, and delays. The interconnection of these elements 
into a particular network configuration is the key step in digital filter 
synthesis. or our purposes here, we need only consider the network 
as a directed graph, with the multipliers and delays being represented 
by graph branches. The branch interconnection points, or nodes, will 
be divided into two types: “summation nodes”, which correspond to 
the adders and have multiple inputs and a single output, and “‘branch 
nodes”’, which correspond to simple “‘wired”’ interconnections that have 
a, single input and one or more outputs. 

A digital filter network may thus be represented as shown in Fig. 
1. The input to and output from the filter at time ¢ = nTJ are denoted 
by u(n) and y(n), respectively. The corresponding output from the 
a branch node is denoted by v;(n); while the roundoff error introduced 
into the filter at the j** summation node is denoted by e;(n). Since 
with fixed-point arithmetic, rounding is performed only after multiplica- 
tions, non-zero roundoff errors are ‘“‘input’’ to the filter only at those 
summation nodes which follow constant (non-integer) multiplier 
branches, as depicted in Fig. 2. 
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Fig. 1— General digital filter model. 


DIGITAL FILTER 165 


Vi(n) ej(n) 


D> 


Fig. 2— Constant multiplier with preceding branch node and succeeding sum- 
mation node. 


— a = 


For a unit sample input to the filter at £ = 0 and no rounding [that is, 
u(O) = 1, u(n) = Oforn ¥ 0, ande,(n) = 0 for all j and n], the resulting 
output values y(n) and v;(n) for all nm = O and all z are designated as 
h(n) and f;(n), respectively. Alternatively, for a unit sample input to 
the j summation node and zero inputs otherwise [that is, e;(0) = 1, 
e;(n) = Oforn ¥ 0, and e,(n) = u(n) = 0 for all n and for k ¥ jj, the 
resulting output values y(n) for all 7 2 O are denoted by g;(n). We 
thus have the following transfer functions of interest, expressed in 
z-transform form: 

From filter input to output: 


| H*(z) = >> A)z”. (1) 
n=0 
From filter input to 7" branch-node output: 
F*@) = Dd fine”. (2) 
n=0 


From j* summation-node input to filter output: 


Gt) =D ails. (3) 


These transfer functions are indicated in Fig. 1. 
The frequency responses (Fourier transforms) corresponding to the 
above transfer functions are given by *~” 


Hw) = H*@’*"), (4) 
F(w) = FFE'*"*), (5) 
G',(w) = GEe'**). (6) 
This notation will be used throughout this paper. That is, for any 
z-transform A*(z) which converges for |z| = 1, the corresponding 


Fourier transform is given by 


A(w) = A*(e*”), 
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If scaling has been included in the filter design in order to satisfy certain 
dynamic-range constraints, then prime marks (’) are added to denote 
this fact [for example, F'}(w), F'*’(z)]. 

Each error source (rounding operation) within the filter is assumed 
to inject white noise of uniform power-spectral density N.. Assuming 
uniformly distributed rounding errors with zero mean, the variance of 
the roundoff noise from each error source is given by*”’* 


g. = A*/12 (7) 


where A is the spacing of the quantization steps (after rounding). To 
eliminate the sampling period 7’ from certain expressions of interest, 
we now define Ny = o}. Hence, the variance, or total average power, 
corresponding to an arbitrary power-density spectrum N(w) with no 
DC component (which implies a zero-mean process) is given by" 


a” -- [ve dw (8) 


where w, 1s the radian sampling frequency given by 
w, = Qn/T. | (9) 


Assume now that k; error sources input to the 7** summation node. 
The spectral density of the roundoff error sequence {e,;(n)} is then just 
k,;N, by our assumption of uncorrelated error sources. The total roundoff 
noise in the output of the filter thus has a power-density spectrum given 
by” 


N,() = 0 » k; | Gj) | (10a) 
where we have substituted o} for Ny . If scaling has been included in the 
filter design, then the corresponding expression is just 


N,(w) = 09 2, ki | G@) (10b) 
where ki = k; to account for the additional scaling multipliers. 


IV. DYNAMIC-RANGE CONSTRAINTS 


The ultimate objective of the synthesis procedures to be investigated 
will be the minimization of some norm of N,(w) for a given quantization 
step size A, subject to certain ‘‘constraints’’. One constraint is that the 


+ This normalization of N(w) is further motivated by the derivation in Section 
V leading to equation (30b). 
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specified transfer function H*(z) must be maintained. Another funda- 
mental, but often overlooked, constraint is the finite dynamic range of 
the filter. Specifically, the signals v;(n) at certain branch nodes within 
the filter cannot be allowed to ‘‘overflow” (that is, exceed the dynamic- 
range limitations), at least not more than some small percentage of the 
time, 1n order to prevent severe distortion in the filter output. 

Overflow constraints are required only at certain branch nodes in the 
digital circuit because it 1s only the inputs to the constant multipliers 
which cannot be allowed to overflow when several standard numbering 
systems are used (for example, one’s- or two’s-complement binary).’* 
Specifically, in the summation of more than two numbers, if the magni- 
tude of the correct total sum is small enough to allow its representation 
by the K available digits, then in these numbering systems the correct 
total sum will be obtained regardless of the order in which the numbers 
are added, even if an overflow occurs in one of the partial sums. Hence, 
those node outputs which correspond to partial sums comprising a 
larger total sum may be allowed to overflow, as long as the total sum is 
constrained not to overflow. This property also applies when one of the 
inputs to a summation node has overflowed as a result of a multiplica- 
tion by a coefficient of magnitude greater than one. 

Turning to the formulation of the required overflow constraints, we 
may easily derive an upper bound on the magnitude of the signals 
v,(n) for all possible input sequences {u(n)}, neglecting the (small) 
error signals e;(n). Assuming zero initial conditions in the filter and 
e;(n) = O for all j and n, the 2* branch-node output v;(n) is given by 


—0,{n) = > f.(iuln —k), all n. (11) 


Therefore, given that u(n) is bounded in magnitude by some number 
M for all n, an upper bound on the magnitude of v,(n) is given by*® 


lo SM DIF, all m = 2) 


Thus, if the node signal v;(n) is also to be bounded in magnitude by 
M for all possible input sequences, the associated scaling must ensure 
that 


r3 | #(k) | $1. | (13) 


That (13) is not only a sufficient condition to rule out overflow for all 
possible input sequences {u(n)}, but also a necessary condition, is easily 
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shown by letting u(n) = + for all n, with sgn [w(m) — k)] = sgn [f;(4)] 
for some m = 7 and all k 2 0. Then from equation (11) we see that 
(12) is satisfied with equality in this case, and thus (18) is a necessary 
condition, as well. 

The norm of f/(k) employed in (13) is not very useful in practice 
because of the difficulty of evaluating the indicated summation in all 
but the simplest cases. Also, for large classes of input signals, (12) and 
thus (18) are overly pessimistic. Therefore, we now derive alternate 
conditions on (the transform of) the scaled unit-sample response {f/(7)} 
which ensure that for certain classes of input signals, the corresponding 
branch-node output v;(n) cannot overflow. The derivation of these 
conditions for discrete systems closely parallels the corresponding 
derivation for continuous systems, as given by Papoulis.”° 

An alternate expression for equation (11) in terms of 2-transforms 1s 
derived as follows: Consider an (absolutely summable) deterministic 
input sequence {u(n)} possessing the z-transform 


U*() = 2 u(nje” a<|z|<)b, (14) 
forsome a < land b > 1. Stability requires that F*(z), defined in equa- 
tion (2), exist for all | z| > cfor some c < 1. Hence, the z-transform of 
{v;(n)} is given by” 

Vi) = F¥)U*(), d<|lz|<b, (15) 


where d = max (a, c). The inverse transform of equation (15) is given 
by® 


iG = ‘ Vi@e" de (16) 


where the contour of integration I is contained in the region of con- 
vergence d < |z| < 6. Sinced < 1 and b > 1, let I be the unit circle 
in the z plane (| z| = 1), and perform the change of variables z = e’°” 
in equation (16). Using equation (15), the resulting equation becomes 


oda) = = | " P@)U@E? de. (ay 


The conditions to be derived from equation (17) are most easily 
expressed in terms of L, norms, defined ” or an arbitrary periodic function 
A(-) with period w, by” 


N4te=[2f"14@ Pa] aa) 
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for each real p 2 1 such that 
[| 4@ Pde < ©. 
0 


It can be shown” that for A(-) continuous, the limit of equation (18a) 
as p — © exists and is given by 
| A Jl. = max | AG) |, (18b) 


Assume now that | U(w) | is bounded from above by some number 
M (that is, || U ||. S$ M). Then, from equation (17), 


jvm) |S M> [| PG) | de 
Ws Jo 
or 

| v(m) | S |] Pe |ls-|] OF [leo (19) 
In exactly the same manner, we may also show that 

[os(m) | S |[ Fs |[o- |] U lh. (20) 
Applying the Schwarz inequality to equation (17), on the other hand, 
yields that 


an a 
or 
Jos) | S |] Fe [lel] Ole (21) 
Note that (19), (20), and (21) are all of the form 


om SUPT, (+t=i1) 


for p,q = 1, 2, and ~. Jt can be shown” that (22) is true in general for 
allp,g>1 satisfying 1/p + 1/q¢ = 1; and we have shown in (19) and 
(20) that if the Z,, norms exist, then (22) holds for p, g = 1, as well. 
The general relation in (22) for all », g > 1, is derived from Holder’s 
inequality. — 

A simple, but important special case of (22) results from jstaiig F*(z) 
= F,;(w) = 1. Since || 1 ||, = 1 for all p = 1, we then have simply 


jum) | S|] UO |, alk g21. (23) 


170 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


But since (23) holds for all sequences {u(n)}, it must also be true that 
lu(m){| S|[Vell,, all r2. 


This is, in fact, the basis of (22), for Holder’s inequality actually states 
that 


Ve lh S | Ps 





Pears (244-1), 

P 
Therefore, the real implication of (22) is that the mean absolute value 
of V,;(w) is bounded by || F; |l,|| U ||,, and this, in turn, provides a 
bound on | v,(n) |. - 

Assume, therefore, that the input transform U(w) satisfies || U ||, S 
for some g = 1. From (23) we immediately have that | w(n) | < M for 
all n. Then, if | v;(n) | is also to be bounded by M, (22) provides a 
sufficient condition on the scaling to ensure this, namely 


(Fil,bSL (vl. =”) (24) 


for p = q/(q — 1). Inequality (24) is the desired condition to replace 
the more general, but often less useful condition given by (18). 

From an engineering viewpoint, the most significant values for p 
and q would seem to be 1, 2, and ~. The case p = 1, g = © requires 
that the input transform U(w) be everywhere bounded in magnitude by 
M (that is, || U ||. S$ 42), in which case only the ZL, norm of the scaled 
transfer function F’/(w) need satisfy (24). For an input of finite energy 
E = >), u’(n), Parseval’s identity implies that || U ||? = £, and thus 
with M = (£)', (24) can be satisfied for p = q = 2. 

The case of p = ©, g = 1 in (24) implies the most stringent condition 
on F'/(w) because from equation (18) it is evident that 


ee [lp SUL PE Ihe (25) 


for all p 2 1. It is clear, for example, that for a sinusoidal input of 
amplitude A S M and arbitrary frequency w) , we must have | F/(w) | 
< 1 for all w (that is, || 7% ||. < 1) to ensure that |v,(n)| S$ M for 
all n. However, a sinusoidal input sequence {u(n)} is not absolutely 
summable, and thus U*(z) as defined in equation (14) does not exist 
in this case. This difficulty may be circumvented, as is common in 
Fourier analysis, by assuming a finite sequence of length N and then 
passing to the limit as N — o. The resulting (Fourier) transform of 
{u(n)} is of the form 








Uo(w) = = [ao — @) + 6@ — w, + &)], (0SwS.,) (26) 
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where 6(w) is the familiar Dirac delta function defined by 


jw) = 0, w #0, 
63 (27) 
/ 6) dw = I. 


U,(w) is, of course, periodic in w with period w, . From equations (18a), 
(26), and (27), we immediately have that || Uo ||1 = A < M, and thus 
with p = ©, (24) is applicable for sinusoidal input sequences, as ex- 
pected. 


V. RANDOM INPUT CASE 


In the case of random input sequences, (24) is not directly applicable 
because the z-transform U*(z) is not defined. Similar conditions may be 
obtained, however, by considering the discrete autocorrelation function 
y(-), defined for a (wide-sense) stationary sequence {w(n)} by 


v(m) = Elw(n)w(n + m)] (28) 


where E[-] is the expected-value operator. A z-transform *(z) may be 
defined for the sequence {¢,,(m)} as in equation (14) with an inverse 
transform as in (16). Assuming ergodicity and a zero mean (E[w(n)] = 0) 
for {w(n)}, we immediately have from equation (28) that the variance, 
or total average power, of {w(n)} is given by 


go(0) = E[w*(n)] = o. (29) 


and from equation (16) we also have 


¢,(0) = e $ : b*(2)z* dz. (30a) 


Letting I’ be the unit circle (2 = e'*”), equations (29) and (30a) imply 
that 
; if’ 
. = / ,,(w) dw. (80b) 
Ws Jo 
Hence, from equation (8) we see that ©,,(w) is just the power-density 
spectrum of the sequence {w(n)}. 
For an input sequence {u(n)} whose autocorrelation function has 
the z-transform *(z), it is well-known that the corresponding transform 
for the output {v;(n)} is given by 


B5,(2) = FF@)FIE)B@) (31a) 
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or 
®,.(w) = | F.@) |? $.@). (31b) 
Equations (29) through (31) imply then that 
gee: = | Plo) 2 B,(co) deo. (32) 
s “0 


Since equation (32) is of the same basic form as (17), a derivation similar 
to that leading to (22) must yield the following relations for p, g = 1: 


| 1,1 
aS lPillell@lle, (b+4=4) (33a) 
P qd 
or, from equation (17), 
2 2 1 1 
a SUF lll, (b+4=1): (33) 
P qd 
Two cases of (83) are of particular interest, namely 
on S || Ps (lee || & ||. (34) 
and 
ove S | Fe [Jo |] & [th - (35) 


In view of equation (25), we see that (34) implies the most stringent 
condition on the input spectrum ©®,(w), whereas (385) yields the most 
stringent condition on the transfer function F';(w). From (84) and (30b), 
for example, we have that if the input power-density spectrum is ‘‘white”’ 
[that is, &,(w) = of for all w], then o?, < || F; ||3o2 . Hence, if the input 
sequence {u(n)} is a Gaussian process,” the node output sequence 
{v;(n)} will overflow no more (in percentage of time) than does the 
input, provided only that 


| #% |l2 S 1. (36) 


The inequality in (85) requires, on the other hand, that for an input 
sinusoid of arbitrary amplitude and frequency, F4(w) must satisfy 


Pe [le 1 (37) 


to ensure against overflow, as we have seen earlier from (24). 
To summarize, dynamic-range constraints of the form 


|Fi,>s1, pel (38) 


have been derived for both deterministic and random inputs, where 
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F4(w) is the (scaled) transfer response from the filter input to the 7® 
branch node and || - ||, denotes the L, norm defined in equation (18). 
For a deterministic input with amplitude spectrum U(w), (38) assumes 
that 


WU |SM, ga (39) 


where M is the maximum allowable signal amplitude. For a random 
input, on the other hand, the use of (88) requires appropriate conditions 
on || &. ||,,7 = p/(p — 2) and p = 2, where ©,(w) is the power-density 
spectrum of the input sequence. 

The effect of (38) and (39) is to bound the mean absolute value of the 
amplitude spectrum at the z* branch node (that is, || V; ||.) which, in 
turn, bounds the peak signal amplitude at that node. The use of (88) 
in conjunction with (83), however, bounds only the average power at 
the 7 branch node, and thus the relationship between this average 
power and the peak signal amplitude at the node must also be deter- 
mined in order to provide an effective dynamic-range constraint. 


VI. TRANSPOSE SYSTEMS 


In the evaluation of different circuit configurations for a given digital 
filter, a useful concept relating certain of these configurations is that 
of “transpose configurations’. This relationship is a general property 
of linear graphs” and will be presented here in terms of a state-variable 
formulation. 

The general state equations for a linear, time-invariant discrete system 
are given by” 


x(n + 1) = Ax(n) + Bun), 
y(n) = Cx(n) + Dum) 


where x(n) is an N-dimensional vector describing the state of the system 
at time ¢ = nT’, u(n) is the corresponding J-dimensional input vector, 
y(n) is the corresponding J-dimensional output vector, and A, B, C, 
and D are fixed parameter matrices of the appropriate dimensions relat- 
ing the input, state, and output vectors as given by equation (40). 
The (VN + I) X (VN + J) matrix S defined by 


(40) 


oe 7 , (41) 
CD 
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provides a convenient single parameter matrix which describes the 
complete discrete system. 

A transfer function matrix 3C*(z) may be defined for the system (de- 
scribed by) S relating the input and output vector sequences {u(n)} 
and {y(n)} by 


¥*(@) = 5g (2)U*(z) (42) 


where U*(z) and Y*(z) are the vector 2-transforms of {u(n)} and {y(n)}, 
respectively. 3C%(z) is readily shown to be given by” 


ac*(z) = C(eI — A) “B+ D (43) 


where (:)~* denotes the matrix inverse and J is the N-dimensional 
identity matrix. 

Consider now a new system which is described by the parameter 
matrix S‘, that is, 


Ae (6h 
BD’ 
where (-)’ denotes the matrix transpose. From equations (41) and (43) 


it is easily seen that the transfer function matrix for the new system 
S‘ is given by 


S' = (44) 








5¢%.(2) = BY(eI — A')C' + D' 
= [5c4(2)]'. 


Thus, the transfer function matrix for the system S‘ is simply the 
transpose of the transfer function matrix for the system S. That is, 
the element H(z) from 3C%(z), which is the transfer function from the 
j* input to the 7 output of system S, equals the element H*;(z) from 
ac%.(z), that is, the transfer function from the z** input to the 7 output 
of S*. Note also that while the system S has a total of J inputs and I 
outputs, the system S’ has J inputs and J outputs. 

The concept of transpose systems will be particularly useful to us in 
conjunction with the digital-filter model introduced in Section II and 
depicted in Fig. 1. Defining the input and output vectors for the filter by 


(45) 


u(n) y(n) 
u(r) = {2 | and y@) = {2 (46) 


es(n) v1) 
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respectively, the transfer function matrix for the filter is given by 
H*(z) Gi@ ++: G5@) 
We). Sa eee 





5e*(2) = (47) 


i er 


WO) Sa a 


where the specific expressions for the elements in other than the first row 
and first column are unimportant for our purposes. By equation (45), 
the transfer function matrix for the corresponding transpose system is 
then simply 


H*(z) FT@) +++ Fr@) 


Ge) —- —- —}. 


CK (z) = (48) 


CG) 3 


Note, in particular, that the transfer function from input-1 to output-1 
[that is, H*(z), the ideal transfer function from filter input to filter 
output] is the same for both systems. 

As discussed more fully in Ref. 1, the circuit configuration realizing a 
given system S is not necessarily unique, and hence neither is the con- 
figuration for the transpose system S*. However, given a particular 
configuration for the system S, a unique “transpose configuration’, 
which realizes S‘, may be derived from the given configuration for S 
by simply reversing the direction of all branches in the given network! 
In particular, then, all delays and constant multipliers remain the same 
except for the change in direction. All summation nodes in the given 
configuration become branch nodes in the transpose configuration, and 
all branch nodes become summation nodes. Likewise, all inputs in the 
given configuration become outputs in the transpose configuration, and 
all outputs become inputs." 

That the transpose configuration defined above actually realizes the 
transpose system S’° is easily seen by considering the state equations in 
(40). The constant multiplier(s) corresponding to the element d;; of the 
matrix D and relating the 7 input and the 7 output of the original 
configuration must relate the z* input and the 7 output of the transpose 

+ Note that the transpose system S' is fundamentally different from the “ad- 


joint” system22 because, although the signal flow is reversed in both, the trans- 
pose system does not run “backwards in time.” 
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configuration, and thus d;; = d;, for all z and j. The multiplier(s) cor- 
responding to the element b;; of B and relating the 7 input and the 
7 state of the original configuration must, on the other hand, relate 
the 2 state and the 7** output of the transpose configuration, and thus 
b;; = c;,; for all ¢ and j. Similarly, c;; = 6;,; for all z and 7. Finally, the 
multiplier(s) corresponding to a;; and relating x;(m) and x;(n + 1) in 
the original configuration must, in the transpose configuration, relate 
x(n) and x;(n + 1), and thus a,;; = a;,; for all 7 and 7. Therefore, the 
transpose configuration indeed realizes the system S’. 


VII. AN EXAMPLE: THE DIRECT FORM 


To demonstrate the application of the results of the preceding sections, 
we now evaluate and compare the roundoff-noise outputs from two 
transpose configurations for a digital filter. The scaling required to 
satisfy the overflow constraints in (88) is derived, and the effect of this 
scaling on the output roundoff noise is determined. 

The transfer function H*(z), defined in equation (1) and relating 
the input and output of the digital filter, may be expressed as a rational 
function in z of the form’’* 


HN 2 il _ A*@) 
ANG) = "= BG) 
1+ >) ba 


t=] 





(49) 


Assuming that ay and by are not both zero, N is referred to as the “order” 
of the filter. There are many different, but equivalent, forms in which 
equation (49) may be written, with a number of equivalent circuit 
configurations corresponding to each of these forms (at least two trans- 
pose configurations). Those forms such as equation (49) which require 
the minimum number of multiplications and additions in the general 
case (that is, 2N + 1 and 2N, respectively) are referred to as “‘canonical”’ 
forms. In general, however, it is necessary to add additional scaling 
multipliers to these canonical forms in order to satisfy the overflow 
constraints in (88). 

The form of H*(z) given in equation (49) is often called the ‘‘direct 
form” of a digital filter. It has been pointed out by Kaiser’ that use of 
the direct form is usually to be avoided because of the sensitivity of the 
roots of higher-order polynomials to small variations (that is, quantiza~ 
tion errors) in the polynomial coefficients. The roundoff-noise outputs 
from the direct form can also be much larger than from other canonical 
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forms.” Nevertheless, the direct form is of theoretical interest, and it 
provides a convenient illustration of our results. Similar investigations 
for the two canonical forms most commonly employed in practice—the 
cascade and parallel forms—are described in Ref. 1. 

Two transpose configurations which implement the direct form with 
scaling are shown in Figs. 3 and 4. These configurations actually realize 
H*(z) in the form 


H*(z) a eet 2 ie (50) 


where ,a, = a,/Ki, and the additional scaling multipliers Ki, k = 
1, 2, are required to satisfy (88) in the general case. The configuration 
in Fig. 3 will be designated as form 1 (that is, k = 1), and Fig. 4 as 
form 2 (that is, k = 2). 

The branch nodes at which overflow constraints are required (because 
these signals input to multipliers) are indicated by (*). The dynamic- 
range limitations are obviously satisfied (by assumption) at the input to 
the filter, but for completeness, an overflow constraint is included there 
as indicated. The scaled transfer responses ,f/(w) to these nodes are 
noted in Figs. 3 and 4, and the corresponding unscaled responses ,/°;(w) 
apply, of course, when K{ = 1 


u(n) 





Fig. 8 — Direct form 1 with scaling. 
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Fig. 4— Direct form 2 with scaling. 


It is intuitively clear that to preserve the greatest possible signal- 
to-noise ratio, the scaling should reduce the magnitude of ,F/(w) no more 
than is necessary (or should increase it as much as possible, as the case 
may be). In other words, ,f%(w) should satisfy 


alt |p = 1. (51) 
This condition will be satisfied if the scaling factors ,s;, defined by 
E(w) = 28s2ls(e), (52a) 
are given by 
i = 1/|) PF; | - (52b) 
It is readily seen from Figs. 3 and 4 that 
FPiw) = Fi) = 1, (53) 


and hence equation (51) is automatically satisfied for these responses. 
Of more interest, however, are the responses 


Ky 


F3(w) = Bo) 





= Ky iF) (54) 
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and 


, = Hw) _ ol", (w) | 


From equations (52), (54), and (55), it follows that (51) is satisfied 
for these configurations if (and only if) 


Ky = 1/|| 1/B ||, (56) 


and 


Ky = || H |l,. (57) 


The rounding-error inputs e;(n) are also shown in Figs. 3 and 4 
along with the transfer responses ,G/(w) from these inputs to the output 
of the filter. Note that in form 2 (Fig. 4) the error input e.(n) incor- 
porates the roundoff errors from all of the multipliers except K even 
though these error sources are separated by delays (z°*). This is done 
for convenience and is possible because of the assumption of uncorrelated 
errors from sample to sample and source to source. The noise weights 
k‘ [see equation (10)] for form 1 are thus 


Ah = ky = N +1; (58a) 
while for form 2, 
ai = 1 and okt = 2N +1. (58b) 


The indices 7 and j of the ,F';(w) and ,G;(w) have been assigned in such 
a way that forms 1 and 2 are related as in equations (47) and (48). 
That is, these unscaled responses satisfy the following equations: 


(vw) = Go), = 1, 2; (59a) 
i1G;(w) = of ;w), j= Aes (59b) 


Note that the scaled responses ,F/(w) and ,G/(w) are not related as in 
equation (59) because, in general, K/ # K‘. In particular, 








/ = H (w) _ €) iN 
iGy (w) —— Ki = Ki ols (w) ; (60) 
while | 
i oy Se sy 
23(w) 2 B(w) , (Ee 3). (61) 


However, we do have, as in equation (53), that 


1G) = 2Gi(w) = 1. (62) 
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From equations (10) and (53) through (62), the power-spectral den- 
sities of the roundoff-noise outputs from these two configurations are 
thus computed to be 








Ny) = oo(N + me a 3 | ZT) r} (63a) 


and 





1 2 
Be) } ve 
The variances, or total average powers of the output roundoff noise 
from these configurations are then, from equations (8) and (18), simply 


Pies ta 


LNs) = o1 + ON +1) [HAE 


1 
|My |, B 











oo(N + {I ae 


and 














1 2 
B 1}. (64b) 


The peak noise densities || ,4, ||. are, on the other hand, bounded by 


NaN, [hk = 01 + ON +0) (IE 


A 











Prick 5a) 


| sW, lle S o3(N + me +4 


and 


IA 











pI 


We now compare direct forms 1 and 2 on the basis of (64) and (65). 
Although comparisons based on bounds for || ,N, ||. as in (65) do not, 
of course, necessarily hold for || ,N, ||.. itself, experimental results have 
indicated that such comparisons are quite effective qualitatively, and 
often quantitatively as well.’ Consider first the expressions in equation 
(64) for p = 2 and in (65) for p = © (that is, || N, ||,, 7 = 1, ©, for 
p = r+ 1]. In these two cases, the only difference between the (a) 
and (b) expressions for forms 1 and 2, respectively, are the k/ , as given 
in equation (58). In particular, for || 1/B ||?|| ||? >> 1 as is often the 
case, the || NV, ||, for form 1 are approximately half, or 3 db less than, 
those for form 2. This result simply reflects the fact that only half of the 
noise sources in form 1 input at other than the filter output; whereas 
in form 2, all but one input within the filter. Hence, if the gains from 
these inputs to the output are large, form 1 is preferable to form 2 by 
up to 3 db. 


aN, lle S ot + ON +0) (HIE 


DIGITAL FILTER 181 


For p ~ r + 1, however, the differences in the k/ are of secondary 
importance compared with the potential differences due to the mixture 
of L, and Z,, norms in (64) and (65). In particular, letting 


m= |B | nace, (66a) 


we immediately see that if 0.2 >> 62, , then form 2 is better for p = © 
while form 1 is better for p = 2. If, on the other hand, 6... << 42, , then 
the opposite applies. 

To gain insight into the above conditions, we rewrite equation (66a) 


as 


2 2 


A 


B B 


It is then clear that the difference between 6... and @,, is due entirely 
to the effect of A(w) on the LZ, norms of A(w)/B(w) for g = 2, © versus 
the corresponding norms of 1/B(w). In particular, A(w) affects the L, 
norm in 6,,. But the Z,, norm of a function ‘‘concentrates’”’ exclusively 
on the maximum absolute value of that function; whereas the L. norm 
of a function reflects the r.m.s. absolute value of that function over 
all argument values. Therefore, the effect of A(w) in 6... results from the 
alteration of the maxima of | 1/B(w) | in | A@w)/B() |; while in 0.2 , 
the effect concerns the difference between | 1/B(w) | and | A(w)/B(e) | 
over all w. 

Intuitively, one expects that the former effect 1s potentially much 
greater; that is, in many cases A(w) should affect the L,, norm in 42, 
much more than the Z, norm in 6,2. In particular, if | A(w) | signifi- 
cantly attenuates the maxima of | 1/B(w) | [as in a band-rejection filter, 
for example], then @,,, should be much smaller than 6,,. . In this case, 
form 2 should be used for p = o, and form 1 for p = 2. If, however, 
| A(@w) | does not provide such attenuation, then | A(w) | must be rela- 
tively constant within the band(s) where | 1/B(w) | is largest [by the 
nature of A(w)], and hence 


1 








6,9 = | (66b) 


qa 

















Pp 

















A 1 
. | Ate) [-|| 4] (67) 
where w, is a frequency at or near a maximum of | 1/B(w) |. But then, 
1 1 
Bnq ~ | A (wo) | B 2 B : ~ Dap ’ (68) 


























and the difference between direct forms 1 and 2 should be less in this 
case. 
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Vill. SUMMARY 


The interaction between the roundoff-noise output from a digital 
filter and the associated dynamic-range limitations has been investigated 
for the case of uncorrelated rounding errors from sample to sample and 
from one error source to another. The spectrum of the output roundoff 
noise from fixed-point implementations was readily shown to be of the 
form 


N,(w) = 09 2 ki | Gi@) (69) 


where the G/(w) are scaled transfer responses from certain “‘summation 
nodes” in the digital circuit to the filter output. «3 is the variance of the 
rounding errors from each multiplier (or other rounding point), and the 
k‘ are integers indicating the number of error inputs to the respective 
summation nodes. 

Defining F'4(w) to be the scaled transfer response from the input to 
the 2 “branch node”’ at which a dynamic-range constraint is required, 
constraints of the form 


Fil, $1 (70) 


for p = 1 were then derived, where || F ||, is the L, norm of the response 
I'!(w). The appropriate value of p is determined by assumed conditions 
on the spectra of the input signals to the filter. The effect of (70) is to 
bound the maximum signal amplitude (for deterministic inputs) or the 
maximum average power (for random inputs) at the z7** branch node. 

A state-variable description was employed to formulate the general 
concept of “transpose configurations” for a digital network and to 
illustrate the usefulness of this concept in digital-filter synthesis. A 
particularly important result is that for a given unscaled configuration 
with transpose responses /’;(w) and G;(w), as described above, the re- 
sponses F'\(w) and G}(w) for the corresponding transpose configuration 
are given by 


F'(w) = Gj(w) and Gi(w) = F,(w). (71) 


Hence, although the overall transfer functions for these two configura- 
tions are the same, their roundoff-noise outputs can be quite different, 
in general. The transpose configuration is obtained by simply reversing 
the direction of all branches in the given network configuration, and 
the poles and zeros of the network are thus realized in reverse order in 
the transpose configuration. 

To illustrate these results, the roundoff-noise spectra N,(w) for two 
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transpose configurations for the direct form of a digital filter were cal- 
culated and compared. The direct form should usually be avoided in 
practice,’ but it is still of theoretical interest and provides a convenient 
example of our general approach. Using a very natural assignment of 
the indices 2 and 7 for the unsealed F';(w) and G;(w), equation (69) was 
shown to be of the form 


Nilo) = {hs + DRUK LGO FY) 


for these (scaled) configurations for the direct form, where M is the 
number of error inputs at other than the output of the filter. Hence, the 
variance, or total average power, of the output roundoff noise is simply 


AL 
oh = oN bees + SBE IPs AE IG) IBY a3) 
j=l 
while the peak spectral density || N, ||.. is bounded by 


M 
IN, eS oifhin + SRR MEP 8 


Identical expressions to (72) through (74) can also be derived for the 
parallel and cascade forms of a digital filter.” The relationship between 
the noise outputs of corresponding transpose configurations 1s Immedi- 
ately indicated by (71) through (74) [although, in general, ki # k+’]. 
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An Optimization Method for 
Cascaded Filters 


By SHLOMO HALFIN 
(Manuscript received August 1, 1969) 


This paper presents a procedure for decomposing an nth order filter 
unto cascadable second order sections. The procedure is optimal in that it 
minimizes the maximal response range for the sections within the frequency 
band of interest. The procedure, based on a modified version of the Bottle- 
neck Assignment Algorithm, describes methods of listing all the optimal 
decompositions as well as of finding a special ‘‘nested’”’ optimal decom- 
position. 


I. INTRODUCTION 


Let ¢(s) be a transfer function 


_ Is) 
$(s) = g(s) 


where f and g are polynomials with real coefficients, and the degree of 
f sS degree of g. 

We consider all the decompositions of the form $(s) =¢,(s)¢2(s) --- 
@,(s) where 


f(s) 
6:0) = (1 
f;(s) and g;(s) are real polynomials and the degree of f; does not exceed 
the degree of g;. The g; are quadratic polynomials, except when the 
degree of g is odd; then one g; 1s linear. 
Let L be a passband region for ¢, where Z is a finite union of passband 
intervals. Then for every ¢; , a number d(¢;) is defined by 


Max | $:(jw) | 
d(p:) = 20 logio “Min leGa T (2) 
wel 
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Also let 
d= Max d(d,). (3) 


t=1,..05 
Then d is a function of the decomposition. We present a procedure 
that determines the decomposition(s) with a minimal d. EX. Lueder 
proposed this optimality criterion.’ 


II. METHOD 


First, we artificially equate the number of zeros [zeros of f(s)], and 
poles [zeros of g(s)], by adding a suitable number of ‘zeros at infinity” 
corresponding to constant unit polynomials. Next we make this mutual 
number even by adding a zero and a pole at infinity, if necessary. In 
this way we get, say, 2¢ zeros and 2¢ poles. 

Pairing two zeros creates an f; ; a real zero can be paired with any 
other real zero, while a complex zero must be paired with its conjugate 
in order to get a real f;. The same is true for creation of g; by pairing 
of poles. 

In the following we assume that all poles, except perhaps one, are 
complex and therefore fixed paired. We call the real zeros which are 
not fixed paired free zeros. 

Next we make all possible pairings of the free zeros. Each such 
pairing, together with the fixed pairing, decomposes f(s) and g(s): 


f(s) = fils)fe(s) +++ f(s); 
g(s) = gi(s)g2(s) «+ g(s). 
Then we compute the matrix D = (d;,), where the elements 


a a4) (4) 


are computed from definition (2). The element d;, represents the ‘‘cost”’ 
of matching zero-pair 7 with pole-pair k. 
An assignment 1s a feasible set of matchings. Using the Bottleneck 


Assignment Algorithm, we determine an assignment k, , --- , k, for 
which 


Max dik; 
t 


will be minimal. We call this minimum the optimal d value for this 
pairing of free zeros. Going through all the possible pairings of free 
zeros, we find an optimal pairing which yields the smallest optimal 
d value. 

Since an optimal assignment (for a given optimal pairing) is usually 
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not unique, procedures for obtaining all the optimal solutions (assign- 
ments) or a nested solution are given. A nested solution is obtained by 
taking an optimal solution, fixing the matching with the largest d 
value, and then proceeding to look for an optimal assignment for the 
remaining ¢—1 f,’s and t—1 g,’s, and so on. 


III. THE BOTTLENECK ASSIGNMENT ALGORITHM 


This section discusses the Bottleneck Assignment Algorithm and its 
adaptation to the present problem. 

Let U = (u,;) be a real t X ¢ matrix. A matching is an ordered pair 
of integers (7 , 7.) 1Si,St, 1S7,St. We associate with the matching 
(t, , J.) the corresponding cell in U. The element in this cell u,; is 
called the cost of the matching. 

Aset A = {(t,,7,); & = 1, --- , ¢} of t matchings (cells) is called an 
assignment if in every row and in every column of U there is a cell that 
belongs to A. The bottleneck assignment problem looks for an assign- 
ment which minimizes the maximum of its matchings’ costs. 

The Gross algorithm’, is based on the following iterative step: 


(*) An assignment A and a real number a, which does not exceed 
all the costs of A, are given; then either a new assignment A’ is con- 





A=ARBITRARY ASSIGNMENT 
a@ =MAXIMAL COST OF 
MATCHINGS OF A 
APPLY THE ITERATIVE STEP (*) 











Fig. 1 — Flow chart for solving the bottleneck assignment problem. 
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structed, so that « exceeds more costs in A’ than it does in A, or it is 
established that no such A’ exists. 


The flow chart in Fig. 1 solves the bottleneck assignment problem. 
This algorithm is fast and requires small memory space, since only the 
present assignment must be stored. 


d-co A={(1y1), (2,2) reeeeeeeeny (t,t)} 
FORM AN UNCHECKED PAIRING 
OF THE FREE ZEROES 
U=D MATRIX OF THE PRESENT PAIRING 


a= MIN (d, MAXIMAL COST OF MATCHINGS OF A) 


APPLY THE ITERATIVE STEP (®) 











STORE THE PRESENT PAIRING 


ARE 

THERE 

UNCHECKED 

PAIRINGS LEFT 
? 









Fig. 2— Flow chart for finding the optimal pairing and its optimal d-value. 
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The basic algorithm was modified to find the optimal pairing of the 
free zeros and its optimal d-value (Fig. 2). Note that the value of a in 
every iterative step (*) does not exceed the minimum of the optimal d 
values of the checked pairings. Thus any pairing that does not reduce 
the d value already obtained is immediately disregarded. 

Also, for each pairing we use the optimal assignment of the preceding 
pairing, as an initial assignment. Thus the costs of the initial assignment 
matchings that correspond to the fixed paired zeros do not exceed the 
current d value. These procedures considerably reduce the amount of 
computation required for finding the optimal pairing and its optimal d 
value. 


IV. CREATION OF THE NESTED SOLUTION 


Let U denote the cost matrix which corresponds to the optimal 
pairing. The nested solution is created by successively applying the 
bottleneck assignment algorithm ¢ times and modifying U each time 
in such a way that the matching with the largest cost becomes fixed 
and irrelevant in the further computations. 

Let (2% , 7%) be the matching with the largest cost at a certain stage. 
Then (2% , 7%) becomes a part of the nested solution. U is then modified 
as follows: 


U...,= © forall s¥ 7; 
U2 = © forall s + 7; 
O ie bec oe 0. 


It is easy to verify that this modification has the properties described. 


V. A COMPUTATIONAL METHOD TO GENERATE ALL THE OPTIMAL 
ASSIGNMENTS 


Let U denote again the cost matrix which corresponds to the optimal 
pairing, and let d* be the optimal d value. We call a cell (2, 7) admissible 
if u;; S d*. The problem of listing all the optimal assignments then 
becomes the problem of listing all possible assignments that use only 
admissible cells. Using the flow chart of Fig. 3 can accomplish this. 
The number of operations can be seen to be dependent on the order of 
the columns of U. The dependence is quite complicated. However, a 
good rule of thumb for reducing the number of operations is to re- 
arrange the columns in ascending order according to the number of 
their admissible cells. 
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LO) =021 (2) =0, 4s 5. L(t) =0 








IS 
I(j)+ J) 
ADMISSIBLE 
? 





PRINT OUT THE ASSIGNMENT 


(0G) 1) (2,2) eens 1), 4) 


Fig. 3 — Flow chart for generating all optimal solutions. 
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Measured Quantizing Noise Spectrum for 
Single-Integration Delta-Modulation 


Coders 


By R. R. LAANE 
(Manuscript received October 14, 1969) 


We give expervmental verification, for 1dle-channel and sinusoidal inputs, 
of a recently developed quantizing noise theory for asymmetrical, single- 
antegration delta-modulators. 


A recent paper by Iwersen described a procedure for calculating 
quantizing noise for single-integration delta-modualtion coders employ- 
ing unequal positive and negative integrator step sizes." The purpose of 
this note is to provide experimental verification of the theory. 

Measured quantizing noise for both idle-channel and sinusoidal inputs 
is given and the idle-channel noise spectrum is calculated. 

Defining the positive, o, , and negative c_ , integrator step sizes as 


OC, =orte 
g.=--—-ot+e (1) 


where o is the average step size, an error wave is generated by the 


oF 

+o yo 
t 2E 
Ww 
5 Oo 
F 
J 
a 
= 
< 

~-~o o- 

i/fs 
TIME — > 


Fig. 1— Asymmetrical integrator output for an idle-channel input, shown for 


lo, [> |o-|. 
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Fig, 2— Observed idle-channel noise spectrum, f; = 1.56 MHz, 9 = 0.0937. 
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Fig. 3— Expanded idle-channel noise spectrum: (a) 0 — 100 kHz, (b) 100 — 
200 kHz. 
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integrator for idle-channel inputs as shown in Fig. 1. The quantizing 
noise spectrum resulting from the error wave is a line spectrum, and the 
line frequencies, f, , for a one-sided spectrum from zero to one-half the 
sampling frequency are given as a function of the integer index l by’ 


fi = | Qld — 8)/2If, | (2) 
where 
Q(a) = a — N(a), 
N(a) = integer nearest a 


and # is the integrator step imbalance e/a and f, the sampling frequency. 
The power at the frequency of index / is calculated from 


P, = 20° / xl’. (3) 


The resulting noise-spectral lines will subsequently be referred to as 
l-lines (1-line, 2-line, 14-line, and so on). 
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Fig. 4 — Calculated idle-channel noise spectrum, f, = 156 MHz, 3 = 0.0937. 
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Measurements of the quantizing noise spectrum were made using 
a delta-modulation coder designed for telephone switching applications.” 
A 1.56 MHz sampling frequency and an average integrator step size 
of 18 millivolts were used for the measurements. Figure 2 shows the 
observed idle-channel spectrum of the coder for a frequency range from 
0 to 1 MHz. The region near the 2-line is expanded in Figs. 3a and 3b 
where the noise spectrum is shown for frequencies from 0 to 100 kHz and 
100 to 200 kHz respectively. 

The calculated spectrum from 0 to f,/2 (0 to 0.78 MHz) is shown in 
Fig. 4 for } = 0.0937. Excellent correlation can be observed with the 
measured spectrum in Tig. 2. For a more detailed comparison, Table I 
gives the calculated and measured frequencies and powers of the /-lines 
for the band from 0 — 200 kHz. With respect to frequency, the agree- 
ment is within experimental error. However, measured peak powers of 
higher order J-lines fall below the calculated values. This discrepancy is 
believed to be due to modulation broadening of the lines by a low-level 
noise input of unknown origin. 

Figures 5a and 5b show the effect of sinusoidal inputs on the coder 
noise spectrum. As suggested by Iwersen, inputs to the coder phase- 
modulate the idle-channel lines and force the frequency band occupied 
by each /line group, Af, to become proportional to the slope of the 
input signal, 2rAf, , and to the index of the /-line,’ 


Af = QlAfy | (A) 


where A is the amplitude and f, the frequency of the input signal. This 
is illustrated in the figures where broadening of the 1-line, 2-line, 3-line 
and 4-line as a function of signal amplitude is clearly visible. 


TABLE I—CoMPARISON OF MEASURED AND CALCULATED NOISE 
SPECTRUM FOR 0 — 200 kHz 


Measured Calculated 
l-line fi P; fi Pi 
2 146 kHz —48 dBm 146 kHz —48 dBm 
9 121 —61 121 —61 
11 24 —66 24. —63 
13 171 —67 170 — 64 
20 98 —69 98 —66 
22 48 —72 48 —69 
24 194 —74 194 —70 


31 73 —75 74 —72 
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Fig. 5— Effect of 1 kHz sinusoidal inputs on noise spectrum; (a) —40 dBm 
input, (b) —30 dBm input. 


Additional discussion of the noise characteristics as well as a de- 
scription of the design of the delta-modulation coder will be presented 


in a future paper.” 
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Use of the Discrete Fourier Transform 
in the Measurement of Frequencies 
and Levels of Tones 


By D. C. RIFE and G. A. VINCENT 
(Manuscript received May 7, 1969) 


This paper considers the application of a digital computer and discrete 
Fourter transform (DFT) techniques to the measurement of signals known 
to comprise only single-frequency tones. We discuss the use of weighting 
functions to wmprove the effective selectivity of a measurement system that 
estumates the frequencies and levels of tones from the coefficients of thetr 
DFT. We present three classes of weighting functions which may be used 
to amprove the inherent accuracy of such a system. The form of the weighting 
functions was chosen to minimize the amount of computer memory required, 
without using too much computer time. Several formulas are derived for 
estumating the frequency and level of a tone from its DFT coefficients. We 
chose the formulas to minimize computation time. 

Simulation results indicate that, through the use of a proper weighting 
function, a DFT measurement system that uses 512 samples taken at a 
sampling frequency of 7040 Hz can be designed so that the maximum error 
im the frequency estimates of two tones near 1000 Hz and separated by 
approximately 50 Hz 1s about 0.03 Hz. The corresponding maximum error 
wn the level estimate zs on the order of 0.03 dB. 


I. INTRODUCTION 


There have been numerous articles, in recent years, dealing with the 
use of the discrete Fourier transform (DFT) in the area of spectrum 
analysis. Much of this interest was motivated by the availability of a 
computational algorithm that facilitates the rapid computation of 
DFT coefficients by a digital computer. The algorithm is, of course, the 
fast Fourier transform (FFT). 

We are concerned with the problem of applying DFT techniques to 
the measurement of the levels and frequencies of single-frequency tones, 
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S(t)o LOW-PASS A/D DIGITAL OUTPUT 
FILTER CONVERTER] |COMPUTER DISPLAY 


Fig. 1— A DFT measurement system. 


particularly tones from a data set during a test. Figure 1 shows the 
system we have in mind. A band-limited received signal, known to 
comprise one or more single-frequency tones, is periodically sampled by 
an A-D converter. A total of N samples are taken and the DFT co- 
efficients are computed from the samples. The computer determines 
which of the DFT coefficients are “large”, indicating the approximate 
frequencies of the received tones, and then proceeds to compute accurate 
estimates of the frequencies and levels. Methods for achieving the first 
part of the procedure are well known. This paper is devoted to a con- 
sideration of how best to go about the last step in the process, the ac- 
curate estimation of the frequencies and levels of the received tones. 

In data set testing, the tone measurement system would be used 
occasionally during a test and would have to consume a minimum 
amount of real time. Thus we have directed our attention toward estima- 
tion methods that use simple formulas and require a minimum amount 
of computer memory. 

Our attention is confined to the problem of leakage, its reduction by 
smoothing (windowing) functions, and the development of formulas 
which extract tone levels and frequencies from the list of DFT coeffi- 
cients. We don’t discuss the important, but secondary, problems of 
round-off errors and other noise sources. 


Il. REVIEW OF DISCRETE FOURIER TRANSFORM 


The definition and properties of the discrete Fourier transform are 
discussed in Refs. 1 and 2. The following review is to refresh the reader’s 
memory and establish the notation that we will use later. 


2.1 Definition of Discrete Fourier Transform 


Consider an ordered set of numbers {X,} where n = 0, 1, 2, --- , 
N — 1. Following Cochran, and others,” we define the discrete Fourier 
transform (DFT) of the set {X,} to be another set of numbers, {Ax}, 
with 


N-1 


Anr= > Xe 7?"*", allinteger K. (1) 
n=0 
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The inverse transformation 1s 


N-1 


A = as > Agerr*’", on = 0,1,2,°°-,N— 1. (2) 
K=0 


2.2 Useful Properties 


Several properties of the DFT are utilized in later parts of this paper. 
The important properties are recorded in this section for future ref- 
erence. Reference 2 provides a more complete list. Derivations are 
included only for results that may not be well known. 

From equation (1) it is obvious that if the X, are real, then 


A_zx = Az  (* denotes conjugate), (3) 
Axsy ac Ax, (4) 
and 
Ay—x = A_rx — As. (5) 
2.2.1 Convolution 
Let 
N-1 ; 
Bx ae > Ne (6) 
n=0 
and 
N-1 
Ce = » Ce, (7) 
n=0 
then 
N-1 1 N-1 
Am = >» X,Y = D-Boy . (8) 
n=0 IN eae 


In other words, if {Bx} and {Cx} are the DFT of {X,} and [{Y,}, 
respectively, then the DFT of {X,Y,} is given by equation (8). 
2.2.2 Power 


It can easily be shown, for X, and Ax defined by equations (1) and 
(2), that 


1 N-1 Ps N-1 . 
W Ay Axdx = 2d, Xn (9) 


If the X, are samples of some function, f(); that is, if X?2 = f*(nT/N), 
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then 
fig eee "2 

lim — yx = f(@ dt 
N 3 ; 


N00 


if the integral exists. Thus, for large N, 


N-1 
[ roaee Ux, (10) 
Hence, from equation (9), 
1 T , 1 Xo ; 
al PO dtm aa De Ac. (11) 


2.3 Relationship to Fourier Transform 

The DFT of samples of a signal has a simple relationship to the 
regular Fourier transform of the signal. It is instructive to examine this 
relationship.! 

Let g(t) be an arbitrary function, zero fori < 0 and é > T and con- 
tinuous over 0 < ¢ < T. The function is allowed to be discontinuous 
at? = Oand at ¢ = T. Assume that g(0+) and g(T'—) exist. 

A well-known application of the Poisson sum formula gives* 


90+) + 29(P-) + Dole) =% y o?BX) aay) 
where 
Gu) = I glow at. (13) 


Adopting a notation similar to that of Papoulis,* we define the ‘“#”’ 
operation by 


GC") = 7 Gw — Kw,), (14) 
where 
w, = IeN/T. (15) 
Then equation (12) can be rearranged to give 
(nT ‘ : 
a o( 2) = G'(0) + 4[90+) — g(T-)], (16) 


where g(0) is taken to equal g(0+). 


+ The recent article by Bergland touches upon this subject and also contains 
an extensive list of references.? 
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Let A(t) be any function of the sort used above for g(¢) with the ad- 
ditional property: h(0O) = h(O+). 
Let s(¢) be the signal to be analyzed and define 


{@) = s@ht). (17) 
Let g(t) = f(t)e"’** and define 
A nT) -; 
Af) = >> {BE Norimere (18) 
n=0 N 
Then from equation (16) and the definition in equation (14) we have 
Aw) = F¥(w) + HfO+) — f(T—-)e"*"), (19) 
where 
f . 
Fw) = fie i" dt, (20) 
0 
If X, = f(nT/N) then the Ax defined by equation (1) are given by 
Z 
4x AC) en 


Thus the DFT of the set {f(n7'/N)} are points along the curve described 
by equation (19). These points are 1/T Hz apart. 

Observe that at w = 2rK/T the term in brackets in equation (19) 
becomes 3/f(0+) — f(7—)] which is independent of K and vanishes 
if f0+) = f(T). 


2.4 Weighting Functions 


If the DFT is to be taken of the set {s(n7/N)} for n = 0 through 
N — 1, then h(é) must be a function whose value is unity att = nT/N; 
n =0,1,---,N — 1. The function with this property that 1s usually 
taken to be A(#é) is the function h-(A); 


1 0s t<T; 
ho(t) = . a (22) 
0, otherwise. 


Other weighting functions, h(), are often formed by multiplying h(t) 
by a nontime-limited function. Weighting functions play a very im- 
portant role in systems that use the DFT. The following paragraphs 
attempt to develop and present some of the pertinent theory. 
From equation (19) we see the role that F*(w) plays in A(w). Since 
{®) = s@hO, 
F(w) = S()*H(), (23) 
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where the * denotes convolution and S(w) and H(w) are the Fourier 
transforms of s(t) and h(t). It can be shown that, subject to the usual 
convergence constraints, 


F"(w) = [S@)*H()]" = S@)*H*(@). (24) 


Thus H(w), or equivalently H*(w), plays a central role in the DFT of 
(weighted) samples of s(¢). From the development that led to equations 
(19) and (21), we see that, if h(0+) = A(T—), the DFT of samples 
of h(t) is a set of points taken along the periodic curve described by 
H* (w). It follows, therefore, that the values of h(nT'/N) can be obtained 
from 


nT\ | 1S +(20k) j2Kn/N 
n(E) = 2 yt (Pak pera (25) 
Also, 
N-1 nT : ; 
H*@) = > n (RE |_rinenem —~ $h00+)[1 — e***]. (26) 


Weighting in the time domain is actually done at the points t = 
nT/N;n = 0,1, 2,---,N — 1. For every set of weights to be applied 
at these points there exists a continuous function with the same values 
at the indicated time points. Thus there is no loss of generality due to 
discussing weighting in terms of weighting functions, h(t), that are 
continuous over (0, 7’) and zero outside that interval. We have to re- 
member, however, that if the set {h(nJT/N)} is specified, h(é) is not 
unique. Thus, if H* (w) is given, h(nT'/N) is given by equation (25), but 
h(t) and H(w) are not uniquely defined. 

There is apparently some confusion in the literature about whether 
H(w) or H*(w) is called a weighting function (or windowing function). 
Blackman and Tukey,’ for example, discuss A(t) and H(w), but when 
Helms’ writes about weighting with a Dolph-Chebychev function, he 
is evidently referring to H*(w). More will be said about this later. 
Bingham, and others, in writing about data windows (See Reference 7, 
Part VII) mean h(t). 

Observe that H*(w) is always periodic with period w,, while H(w) 
is not periodic. (If it were, H*(w) would not converge properly.) Gen- 
erally the H*(w) that one uses will have a prominent main lobe about 
w = Ka, (K is any integer, including zero) and many side lobes. For 
our purposes it is important to obtain a narrow main lobe and low- 
amplitude side lobes. 

The class of H*(w) with the minimum main-lobe width for a given 
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side-lobe amplitude is known as the (discrete) Dolph-Chebyshev weight- 
ing functions.® A convenient form, similar to the one given by Helms,’ 
but changed to describe the result of weighting by sample values that 
peak at 7T'/2 and are adjusted to cover approximately unit area as later 
weighting functions will do, is the following: 


# = N —-jwT/2 i = ( a) 
H"(w) = R ° cos | N cos ~ |Z, cos oN (27) 
where the side-lobe amplitude, 1/R, is related to Z) by 
R = cosh (N cosh *Z,) (28) 


and N is the same as used in equation (1). 

The class of H(w) with the minimum main-lobe width for a given 
side-lobe amplitude is known as the continuous Dolph-Chebyshev 
functions,’ which are unrealizable. The Taylor approximations to the 
continuous Dolph-Chebyshev functions’™”* are realizable, however, and 
provide almost the same main lobe width for a given maximum side-lobe 
amplitude. 

The problem of choosing ‘‘good’’ shapes for H” (w) can be approached 
by treating H” (w) or by treating H(w). Most of the well-known weighting 
functions are discussed in terms of H(w) or A(t). 


2.5 A Generalization 


If h(é) is a function that is zero fort < T, andt > T, then it can be 
shown (sampling theorem) that H(w) is given by 


Hw) = Te#*?” sin @P/2) = (29) 
a eee 
5 
and 
TC, = (22). (30) 


Thus the specification of a weighting function is equivalent to the 
specification of the constants, C’, . 


Ill. SELECTED WEIGHTING FUNCTIONS 


3.1 Leakage and Aliasing 


Leakage will be used here to refer to the problem of the values of 
A(w) due to cos (wot + 69) interferring with the values of A(w) at some 
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other frequency, say ,, where the response due to cos (w,é + 6;) is 
to be examined. Leakage, in our system, is minimized by the use of 
weighting functions. 

Aliasing refers to the fact that in a sampled-data system tones with 
frequencies above w,/2 cannot be distinguished from tones with fre- 
quencies less than w,/2. In our system aliasing is avoided by the use 
of the low-pass filter (Fig. 1). 


3.2 Convolution of Weighting Functions 


The object of weighting is to produce the DFT of a weighted set of 
samples of the signal undergoing measurement, s(t). Thus we seek to 
compute 


N-1 
An = dO s(nt)antje?"’”, forall K; (31) 
n=0 


- where ?, = T'/N. A convenient way of doing the weighting is to first 
compute 


N-1 


Bes dD snije?'*, (32) 
n=0 
forO < K S N — 1. Then if the set {Z,,}, 
jie ; 2rm 
Hy = Sohne = H*(2t), (33) 
n=0 


is stored in the computer, the Ax can be computed from equation (8). 


3.3 A Special Class of Weighting Functions 


The amount of computer memory required to store the set {H,,} 
will be small if h(¢) is a function such that H,, = Ofor MZ <|m| Ss N/2 
and WM is arelatively small number. The H(w) corresponding to this class 
can be expressed by a particular form of equation (29): 

Af C 
Hw) = Te'*sn X DY) >, (34) 


n=—M — 10 


where 
X = wT'/2 (35) 


and M <« N/2. We have restricted our attention to the results that can 
be obtained with this class of weighting functions. 

Most of the well-known weighting functions, such as Hanning,’ 
Hamming,’ and Taylor’’’”’ are in the class defined by equation (34). 
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The discrete Dolph-Chebychev and the Kaiser—Bessel’* weighting 
functions, however, are not. 

The right side of equation (84) can be written over a common de- 
nominator to obtain the form 


4 sin X P(X) 


D'¢ M ) 
I] x __ nx”) 





Hw) = Te (36) 


where P(X) is, in general, a complex polynomial in X. We will restrict 
our attention to H(w) with real C, and Cy) = 1. In which case C_, = 
C,, and, if D, = 2C,, we have 


h(t) = h(O| + > D,, cos (2a!) |. (37) 
Equation (34) becomes 
Hw) = Te™'* sin x|4 -+- yg ane -PX_,|. (38) 


In the next few sections we will discuss three classes of weighting 
functions with the form of equations (37) and (38). They were chosen 
to provide two extreme cases of weighting and an intermediate example. 
Many other weighting functions in the class defined by equations (37) 
and (38) exist; the ones examined below provide sufficient data for our 
purposes. 


3.4 Class I Weighting Functions 


We first consider the class of weighting functions that provides the 
best possible reduction in | H(w) | for large w. Let this class be known 
as Class I. 

The only part of equation (36) that can be adjusted is the polynomial, 
P(X). Thus we must choose the coefficients, D, , to minimize | P(X) | 
for large X. This is done by forcing P(X) to be a constant. The constant 
term in P(X), from equation (34), is 


PO) = (—1)" 9°" (M))’. (39) 


Hence, the desired class of weighting functions has the form, from 
equation (36), 





pee SX (= 1) “MN 


(40) 
4 M 

I] ‘x? ae nr? 

n=1 


Hy(w) = 
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We denote the coefficients, D, , of this class of weighting functions 
as D;(M, n), making the dependence upon JM explicit. From equations 
(36), (88), and (40) the D;(M, n) are given by 


f= 1)"6 uy eq es igo 


DM, n) = lim (41) 
ae Xe llx- Ken 
K=1 
Kvaluation of the limit and some simplification gives 
= n ! 2 n — 
Gh re 2. epg Ie, 5) 


(M —n)!(M +n)! ki M+K 


We denote the weighting functions that use equation (41) as ha,(t). 
Then from equation (37) 


hart) = ho( ols 1+ : D,(M, n) cos aunt | (43) 
This can easily be shown to be the same as 


1.0 


0.8 





Fig. 2— Spectra of the Class I weighting functions. 
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hart) = h(t) aay sin?” (=!) : (44) 





Thus the Class I weighting functions are described by equations (40) 
and (44). The so-called hanning weighting’ is equivalent to h,(é). Larsen 
and Singleton’ used h,(é), h2(t), and others. 

Fig. 2 shows the shape of (1/T)e7°"”"H,(w) for M up to 4. In Fig. 3 
we have plotted the normalized transmission of Hy,(w) (that is, 
20 Logio (Har(w)/T|). Several of the hy(t) are shown in Fig. 4 and 
some values of D,;(M, n) have been tabulated in Table I. 


3.5 Class II Weighting Functions (Taylor) 


Class I weighting functions provide the minimum high-order side- 
lobe amplitude in H(w) that is possible with a given value of M. We 
now turn to the class that gives the minimum main-lobe width, at the 
expense of higher side-lobe amplitude. 

The so-called continuous Dolph-Tchebycheff weighting functions” 





Fig. 3— Normalized loss of the Class I weighting functions. 
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4 





Fig. 4 — Time response of the Class I weighting functions. 


provide the minimum main-lobe width in H(w), consistent with a 
specified maximum side-lobe amplitude, but they are unrealizable 
functions. The Taylor’’ approximation to the Dolph—-Tchebycheff func- 
tions provides almost the same main-lobe width and side lobes that have 
the specified maximum amplitude near the main lobe and then gradu- 
ally decrease as w increases. 

The Taylor functions have the form given by equations (87) and 
(38) with the D, dependent upon M and the maximum side-lobe ampli- 
tude, 1/R. We will denote the D, coefficients of Taylor weighting by 
Di (R, M, n), making the dependence upon FR explicit. After adapting 
Taylor’s equations to our situation, the D,;’s are given by 


M ay 
I] E ~ 42 ae = ok | 
Dyk, M,n) = _kail +E 2) 








a-G@] 
i PNG 
where 
R = cosh (md) (46) 
and 
pe Mt (47) 


ee CE Aa): 
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TaBLE I —Vauuss or D;(M, n) ror M ur To 4 


nr 
M 1 2 3 4. 
1 a1 — — 
2 —4/3 1/3 os 2 
3 —3/2 3/5 —1/10 — 
4 —8/5 4/5 ~8/35 1/35 
Solving equation (46) for \ gives 
\= In a a/R? = 1). (48) 


We will refer to the Taylor functions described by equations (45) 
through (48) as Class II weighting functions and denote them by 
k,(R, t) and Ky,(w). References 10 and 11 give a further discussion of 
Taylor functions. Taylor weighting functions have the property that, 
if Mf is too small, the D’s given by equations (45) will define an H(w) 
whose first few side lobes have the amplitude given by equation (46), 
but some of the higher-order side lobes will have much higher ampli- 
tudes. Thus, for each value of desired side-lobe level, 1/R, there is a 
minimum value of M that will give good side-lobe suppression. 

Some minimum values of M that give good side-lobe control are 
listed in Table IT. 

Figures 5 and 6 show the shapes of a Class II weighting function with 
M = 7 and R = 10°. This particular weighting function will be ex- 
amined below when simulation results are compared. It will be shown 
there that this weighting function is useful when the received tone 
frequencies are very closely spaced. 


3.6 Class III Weighting Functions 


The third class of weighting functions has been chosen to have, to a 


TasBLE IJ—Mintmum VALuEs oF M ror Goop SipE-LOBE CONTROL 


20 logio R (db) M 
36 3 
42 4 
48 5 
54 6 
60 7 
66 9 
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Fig. 5 — Normalized loss of the Class II weighting functions. 


large extent, the desirable properties of both Class I and Class II 
weighting functions. That is, Class III weighting provides better reso- 
lution than Class I weighting for tones with a “small” frequency 
separation. Moreover, they also provide better resolution than Class IT 
weighting of tones with a “‘large’”’ frequency separation. 

We will identify the Class III weighting functions by g,(¢) and 
Gyr(w), where ga, (¢) < Gy,(w). The D, coefficients for this class will be 
denoted by Dy;:(//, n). The first member of the class is chosen as dis- 


Fig. 6— Time response of the Class II weighting functions; M = 7, R = 1000. 
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cussed below and the other members are obtained by operations on the 
first. 

In order to have the high-order side lobes of Gy;(w) fall off at least 
as 1/w” we must have g,(0+) = gy(7—) = 0. In terms of the D, 
values this means that 


I + » Dirs(M, n) = 0. (49) 


With this restriction, of course, g(¢) is the same as the Class I weighting 
function, h,(¢). Thus the distinguishing properties of Class III weighting 
are determined by g,(t). 

We chose the coefficients of g2(t) so that the loss of G.(w) reached 60 
dB with as small a value of w as possible with the side lobes of G2(w) 
never exceeding —60 dB, subject to equation (49). The D, values for 
this condition are: 


Dyr1(2, 1) = — 1.19685, Di (2, 2) = 0.19685. 


The g2(é) thus defined is almost the same as Blackman’s’ proposed 
function, Q.(f). | 

The rest of the members of the Class III functions are defined in a 
manner similar to that used by Helms’ for the synthesis of digital 
filters. We define 


oe 1+ 4 Din(2, 1) DM — 2,1) +4 Din2, 2) DM — 2,2)? 


M > 2. (50) 


The normalization in equation (50) puts g,,(é) in the form of equation 
(87). 

The Class III functions just defined have high-order side lobes, in 
Gy:(w), that decrease as w ”. This contrasts with w ‘“*” for Class I 
weighting and with w* for Class II weighting. Thus, Class III weighting 
functions provide slightly narrower main-lobe width than Class I at the 
expense of slightly higher side lobes. 

Some values of Dy31(M, n) are tabulated in Table ITI. 

In Fig. 7 we have plotted the normalized spectra (that is, 
(1/T)e*°"’Gz(w)) of some Class III weighting functions. Fig. 8 illus- 
trates the normalized loss provided by Gy,(w) for values of Mf up to 4. 
It is interesting to note, from Fig. 7, that G.(w) reaches —50 dB before 
any of the others, just as H.(w) did in Fig. 3. In Fig. 9 we have plotted 
gu (t) for values of M up to 4. 
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TasLE IJJ—Vatvss or Dy11(M, n) For M up To 4 


n 
M 1 2 3 4 
2 —1.19685 . 19685 = = 
3 —1.43596 .497537 — .0615762 — 
4 —1.566272 129448 ~~ .180645 .0179211 


IV. RESPONSE TO A COSINE WAVE 


We are interested in measuring the frequencies and levels of signals 
that comprise several sine waves. In view of this and the linearity of the 
DFT it is convenient to examine the properties of the DFT of samples 
of cos (wot + 6). 


4.1 Basic Formulas 


Let 
s(t) = cos Wot + 6) (51) 
{ = Ar() cos Wot + 4). (52) 
1.0 
0.8 
s 0.6 
= 
O 
ss 
© 04 
a a 
0.2 
0 
0 { Vs ae 4 
of 
oT 


Fig. 7 — Spectra of the Class III weighting functions. 
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Tig. 8— Normalized loss of the Class III weighting functions. 
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Fig. 9— Time response of the Class III weighting functions. 
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and 
Hes / ho(tei®* dt (53) 
or, in the usual notation, 
H p(w) - he(d). (54) 
Then 
F*¥(w) = Le H* (a — Wo) + Le PTE (w + wy) (55) 


and from equation (19) 
Aw) = $e" Ht(w — wo) + 3¢ "HT + wo) 
+- [eos 6 — cos (woT' + O)e77°’ J. (56) 
With h-(t) as defined by equation (22) 


-jo7/2 SIN (wT'/2) . 
wT" /2 
It is possible to put equation (57) in equation (56) and evaluate the 


indicated summations. The same answer can, however, be found more 
easily from equation (18). With equation (52) in equation (18) we have 


= won’ 
Aw) = >> cos ( W + g)erinern (58) 
n=0 


H,() = Te (57) 





or 
* N71 . . N71 ° 
A(w) ee 19? oe a Ag? e ilar eonT/N (59) 
n=0 n=0 


Both sums in equation (59) are finite geometric series. Thus after re- 
arrangement we obtain 


A(w) = dee N-Y? wR alecdg Ot ae ee (60) 
where 
z= eat (61) 
and 
re @ + ao)T" (62) 


2N 
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The formulation given by equation (56) is important from a con- 
ceptual point of view while that of equation (60) is useful for numerical 
evaluations. 

The evaulation of equation (60) when either z or y is zero requires an 
examination of limits, which can be done by inspection. 


4.2 General A(w) 


The use of one of the weighting functions defined by equations (84), 
(37), and (388) gives rise to an A(w) different from the one calculated in 
equation (60). The more generalized form of A(w) is given by 


genet N) 


M 
ee ee 
Aw) = See!" sin Nz D> C, ————~ 
n=—-M é ( > 
sin = = 
N 
i(y-nar/N) 


AL 
98 aN. s € 
+ se eet “sin Ny Ss C.. 
n=—-M . ( a) 
sin =) 
N 


All of the simulations, to be discussed below, used this A(w), in 
equation (21), to compute Ax values. 


(63) 


4.3 Approximations 


We will make use of several approximations in the next section. The 
important ones are established here. In this section h(t) is assumed to 
be the weighting function and w is in the range 0 < w < w,/2. 

Consider equation (19). If | F(w — lw,) | is small for 1 ¥ 0; then 


A(e) = FF) + MOH) = f=) (64) 
where 
it, = T/N. (65) 
Thus, from equation (21), 
Ag = 4 r(A2) 4 a4) — fer). (66) 


Next consider equations (56) and (57). | Hr(w) | is “large” only 
near w = 0. Thus we obtain another approximation, used when s(¢) = 
cos (wot + 8), 


F(w) & 3e!’H p(w — wp). (67) 
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From equation (57) we then obtain 


N i0,~it(o— wo) 7/21 Sin [Ww — wo) 7/2) 























l 
Z Fo) = 9% e (@ — o)T'/2| (68) 
Thus, 
fe (Koel = oT) | 
N 2N 2 
Ae So 69 
vee 2) (Kast ~ 
2N 2 
or, because, 
Kw, 
oN = Kr, (70) 
wy {sin (ix = af 
Ae a (71) 
7 | (x - vel) | 
a 


Irom equation (71) we see that, apart from the error in the approxi- 
mations, one should be able to accurately estimate the frequency and 
magnitude of a cosine wave from the Ax values for K near wol'/2r. 

The main lobe of a sin X/X curve reaches zero at X = -+-w. Thus the 
main lobe of the curve, of which the Ax are points, reaches zero at 
W = wo + 27r/T or, sincew,/N = 2r/T, at 


oO = w + w,/N. 


The main lobe is just wide enough to contain two Ax values (estimators), 
except if w) equals some multiple of w,/N. It will be shown later that 
two Ax values will be enough to estimate the parameters of a cosine 
wave. 


V. FREQUENCY AND LEVEL ESTIMATION 


In the preceding sections we have developed methods of computing 
the DFT coefficients of a known input (for simulations) and have dis- 
cussed three classes of weighting functions to improve the effective 
selectivity of the DIT process. Our final task, which is undertaken in 
this section, is to determine accurate ways of processing the DFT 
coefficients to extract the frequencies and magnitudes present in the 
sampled signal. 

The methods are to be useful when the real-time demands upon the 
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computer are important. Thus all of the methods were chosen to have 
simple formulas which the computer can be programmed to evaluate 
when it is making a measurement. 

The equations we have examined fall into two classes: those that use 
only two estimators to calculate the frequency and level of a cosine 
wave and those that use many. The following paragraphs derive the 
most promising of these equations. The next section will present the 
accuracies that the various equations can achieve. We start with a 
formula that makes use of many estimators. 


5.1 Method 1 


The derivation of this method is somewhat involved, so we first 
explain the motivation behind it. 
Suppose one has an f(t) that is known to be given by 


f(t) = B cos aol, (72) 


and one wants to determine w, and B from operations on f(t). One way 
to determine B is from 


ee ee ie ee ae ee 
lim | fo at = tims | B?/2 dt 


= B*/2 (73) 
thus, 
1 T 
=o i: P(t) dt. (74) 
T— 0 T 0 
The derivative of f(¢) is 
f’(é) a — Bod sin Wob (75) 
and 
* 1 > 2 2 2 
lim = | [f'(@J? dt = Brat /2 (76) 
T0 T 0 


Thus w,) can be determined from 


ee | sd 

lim = ole dt 
es fan), Wore (77) 
lim 5 i f(t) at 


The above result is the motivation for the following derivation. 
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Assume 
s(t) = B cos (wot + 8) (78) 
and 
f(t) = h@®B cos (wot + 8) (79) 


where h(t) is one of the weighting functions, given by 


h(t) = E + > D,, cos eal h,(t). (80) 


Let the estimators, Ax , be given by equation (31). 
Define 


T 
Py = re i [f()]° dt (81) 
to L' Jo 
and 
1 T 
Py = lime ff (oy at. (82) 
T90 T 0 
Expansion of equation (79) gives 


f®) = Bho()\c0 (wot + 6) 


+- ; >, D,,[cos (wot — nwt + 0) + cos (wot + nwt + ont (83) 
where 
w, = Q@r/T. (84) 
From equations (81) and (83) 
B’ 1 Mf ‘ 


Calculation of P, from equation (83) yields, assuming nw, ¥ wp, 
Phe M n° ae 
P, = at + > Pat =a + SD, (86) 
n=] 
Combining equations (85) and (86) gives 


M 
p Ww, >> nD? 
wo = > —- —. (87) 
2+ >) D 


n=l] 
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The next step is to compute P,/P, from the DFT of f(¢). Using the 
approximation 


Rigi tys 8 ee 
f(x ne os Ane" (88) 


which can be rearranged to give 


{®) & Mee ‘4, +2 pS [Re (Ax) cos Kw,t — Im (Ax) sin Ket 


(89) 
From equation (89) 
1 N/2 
Pym ta Ai +2 2D | Ax | (90) 
and 
1 N/2 
Py wa 22 | Axl oe (91) 
Thus, 


N/2 
Ww. > K? | Ax |? 
wy At ___.. (92) 


N/2 


2 
f04 | Ag 
K=1 


We use equation (92) in equation (87) to obtain the final result. 
Denote the estimated w) by 4). Then 


N/2 
| Sear | 
= Ww, a — Ux (93) 
Ag + DU | Ax |’ | 
=1 


a ta 


where 
M 
nD? 
Oy = aN ee (94) 
2+ >) Dt 
n=l 


By D, we mean, of course, D;(M, n), Di (R, M, n) or Dyy(M, n). Thus 
Method 1 is applicable to all three classes of weighting functions. Some 
values for Uy, are tabulated in Table IV. 
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TaBLE IV—VALUES OF Uy AND V xy 


Class I Weighting 


0 0 of) 

1 .9dd000 75 

2 .571429 972222 
3 .818182 1.155 

4 1.06667 1.31327 


Class II Weighting 
(M = 7, R = 1000) 

U, = .857071, Vz = .769710 
Class III Weighting 


M Un Var 
2 .45732 .8678 
o .715523 1.07833 
4 . 968949 1.25033 


The way to use equation (93), when more than one tone is present 
in s(t), is to use only the estimators, Ax, for K % w/w, to calculate 
each 6. The simulations below will show that this technique gives 
accurate results. 


5.1.1 Estimation of Level 


From equations (85) and (90), we obtain the way to estimate B. 


N/2 
; , 40+ 2 bs [| Az? 
fg og (05 
where 
Af 
Vu =a+i 2 Di. (96) 


Some values of V,, are tabulated in Table IV. Observe that if only 
the basic weighting, h(t), is used, then Uj, and V;, become zero and 
4, respectively. 


5.2 Method 2 


The preceding formulas for estimating the frequency and level of a 
cosine wave from its DFT use more than two estimators. The next few 
paragraphs establish formulas that require only two estimators. The 
formulas apply only to the Class I weighting functions, /y,(t). 
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We start by recalling the approximations given by equations (64) 
and (67). With H,;(w) substituted for H,(w) in equation (67) we have 


i. 
A(w) wd) of. eV" Hur at Wo). (97) 
From equation (40) 
ee 4) 2M ay V2 
ee 2X) _ T oni sin X (—1)" ar" (13) (98) 
T ae AM 
a ee 
[] (© - wa 
n=1 


where X = wT7'/2. If s(t) = Bos (wot + 6) then, from equations (97) and 
(98), 


| Ale) |e | (-"ary? 5 22 — (99) 
T a8 (v + n) 
where 
v= (wo — w)/w,. (100) 


Suppose the largest! estimator is A, and its largest immediate neighbor 
A,, . Of course m — 1 = +1. Define 


a=m—1= +1. (101) 
Let 
a, = | A, | (102) 
and 
dz = |An| = | Arse |. (103) 
Define 
reo — |» =1/2 sus 1/2. (104) 


Then from equation (99) 


o. i 
7 I] w+n) 


n=—M 


a, & (-1)"(M)) (105) 


+ By largest we mean | A; | > | Ax| for K 541. 
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and 
ay ee (— 1)" y? BY sm mtu — a) (106) 
T IT u+n—a) 


since a = +1, equation (106) is the same as 


a & —(—1)"(uP es oe ae (107) 
« I] @w+n) 
n=—-M-—a 
Division of equation (105) by equation (107) gives 
a, w-—a(M +1) 
a (108) 
Define 
_ a(M +1) — aM 
Ut; = a +o ae as (109) 
then from equation (108) an estimate of u is 
tu = AU). (110) 


Hence, the estimate of w, is given by 
by = w.(l + 2). (111) 
From equation (105) the estimate of B is 
M 
a,20(—1)" [J] G@—n) 
- n=— Af 
——W)'sinrd oy) 


where @ is defined by equation (110). 
Another version of equation (112), better for machine computation, is 


B= xaiven Ht CG) a3 


Method 2 with M = 0 is essentially the same as was derived by 
Penhune and Martin’* to solve a radar problem. 


5.8 Method 3 


The simplicity of the estimation equations of Method 2 led us to 
extend this method to include any class of weighting functions described, 
in general form, by equations (37) and (88). We will refer to this more 
general. method as Method 38. 
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From equations (109) and (110) we see that if a Class I weighting 
function is used, one way to obtain an estimate of u is to use the function 


Cay = Da, 


eee: (114) 


i, = 
in equation (110). There are three degrees of freedom in the bilinear 
form and we chose to express them in the manner shown in equation 
(114). 

Method 8 is simply the application of equation (114) to other classes 
of weighting functions. We obtained values for the coefficients in equa- 
tion (114), for several weighting functions by: 


(i) Computing a, and a,, using equations (21), (63), (102), and 
(103), with 6 = O and many values for wy near w,/4. 
(ii) Computing the corresponding values of u from equation (104). 
(iii) Choosing values for C, D, and # such that equation (114) gave 
a good fit to the data computed in the first two steps. It turns out that 
the curve described by equation (114) is satisfactory if it fits the com- 
puted data exactly at u = 0, +, and 3. 


In this manner we obtained the following coefficient values: 


Class II weighting, M = 7, R = 1000; 
C = 1.96339, D = 1.01643, EF = 0.893534. 


Class III weighting, M = 2;C = 2.56919, 
D = 1.53874, FE = 1.06345. For M = 3; 
C = 3.6020, D = 2.5862, EF = 1.0317. 


Using an approximation similar to equation (71), but extended to 
include weighting functions described by equation (38), we obtain an 
estimate for B, 


B = 27a, 





N sin fru)| 2 i Se | | oe) 


x! —- nN 


Method 3 can be used with weighting functions specified only in 
terms of H*(w) as well as those given in terms of H(w). 


VI. COMPARISON OF ACCURACIES 


In the preceding paragraphs we have derived several formulas that 
produce estimates of w) and B from A, values (estimators). We now 
turn to a comparison of these formulas on the basis of accuracy. The 
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accuracies we will compare do not include any possible computation or 
nonlinearity errors or other accuracy limitations that may be present 
in a DFT analysis system. Our accuracy comparisons include only the 
effects of leakage. 

The estimators used in the simulations were generated by using the 
function A(w), described by equation (63), in equation (21). This is 
equivalent to applying the weighting by multiplication in the time do- 
main or by convolution in the frequency domain. The use of equation 
(63) in simulations greatly reduces computation time. All of the simula- 
tions used N = 512 andf, = N/T = 7040 Hz. 

All of the estimation methods presented above use approximations. 
In this section we shall demonstrate just how good the approximations 
are. 

Consider the case where a tone of frequency f, , angle @) , and ampli- 
tude B, is being measured while another tone, at frequency f,, angle 
6, , and level B, , is also being received. The presence of f,; will affect the 
accuracy of any estimate one makes of fy or By (due to leakage). The 
size of the errors in the estimates of fy and By will depend upon which 
formula (method) is used and upon the values of fy , Bo, %, fi, Bi, 
and 6,. The combination of parameters that causes one method to 
give the worst estimates will, in general, not be the combination that 
causes another method to be at its worst. Thus it is difficult to compare 
methods. 

We have compared the three methods on the basis of the worst 
estimates each will make when 6) and fy are confined to a specified 
range of values (for example, 990 S f,) S$ 1003.75 Hz andO S @ S 360 
degrees) and B,, B,, f,, and @, are fixed (for example, By) = B, = 1 
and 6, = 0 degrees). 

Notice that if f; is equal to some multiple of 1/7 then its Ax will be 
very small except for K near Tf, . Thus such an f,; cannot cause much 
error in any of the three methods of estimation of fo, which use Ax 
values. For this reason we have fixed f, at a value that is an odd multiple 
of 37. 

Tables V and VI illustrate how inaccurate frequency and level esti- 


TaBLE V—Pooresst ESTIMATES WITH INTERFERENCE SEPARATION IN 
THE RANGE 55 + 6.88 Hz, No Spectan WEIGHTING 


Method Frequency Error, Hz Magnitude Error, dB 
1t 6.82 .596 
2 7.37 . 580 


+ Method 1 using only six estimators, those from 1 — 2a tol + 3a. 
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TaBLE VI—PooreEst ESTIMATES WITH INTERFERENCE SEPARATION IN 
THE RANGE 178.96 + 6.88 Hz, No SpectaL WEIGHTING 


Method Frequency Error, Hz Magnitude Error, dB 
1t 3.08 .0628 
2 3.75 .165 


+ Method 1 using only six estimators. 


mates are when no special weighting (JJ = 0) is used. The interfering 
tones corresponding to the simulations described in Tables V and VI 
were located at 1051.88 and 1175.63 Hz respectively. The frequency 
and magnitude error entries in these, and subsequent, tables give 
absolute values only. 

The simulation results presented in Tables VII and VIII indicate that 
substantial improvement in the accuracy of frequency and magnitude 
estimates can be achieved when weighting is used. The data in Table 
VIT shows that when the two tones are separated by a “small” frequency 
difference accurate frequency and level estimates can be obtained by 
using Method 8 with Class II or Class III weighting. Table VIII in- 
dicates that as the frequency separation increases Method 2 with Class I 
weighting is better. The accuracy of estimates made on closely spaced 
tones can, of course, always be improved by increasing N and T while 
keeping the ratio N/T constant. 

It is interesting to note from Figs. 3 and 8 that, fora given value of 


TasBLE VIJ—Poorest EsStiMatTEs with INTERFERENCE SEPARATION IN 
THE RANGE 55 -+ 6.88 Hz 


Class I Weighting 


Method M Frequency Error, Hz Magnitude Error, dB 
1 1 .89 ll 
1 2 3.32 AT 
1 3 6.16 .96 
2 i .513 .12 
2 2 .104 11 
2 3 .409 5 
Class II Weighting, R = 1000 
Method M Frequency Error, Hz Magnitude Error, dB 
1 7 1.14 .120 
3 7 .0651 9 .35E-3 
Class ITI Weighting 
Method M Frequency Error, Hz Magnitude Error, dB 
1 2 2.11 225 
1 3 5.12 821 
3 2 .034 .026 
3 3 . 149 .0655 
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Taste VIJI—Poorest EstimMaTes with INTERERENCE SEPARATION IN 
THE RANGE 178.96 + 6.88 Hz 
Class I Weighting 


Method M Frequency Error, Hz Magnitude Error, dB 
1 1 1.67E-3 2.79H-4 
1 2 3.98EH-4. 5 .25E-5 
1 3 4.55 -2 4.70E-3 
Z 1 5.73E-3 1 .42H-3 
2 2 1.47E-4 2.19E-5 
2 3 2.13E-5 1.55E-6 
Class II Weighting, R = 1000 
Method M Frequency Error, Hz Magnitude Error, dB 
1 7 5.56E-3 2.05E-5 
3 7 8 .09H-2 1.07E-2 
Class III Weighting 
Method M Frequency Error, Hz Magnitude Error, dB 
1 2 9.10E-5 1.19E-5 
1 3 1 .85E-2 1.90E-3 
3 2 3.35E-3 3.24E-3 
3 3 5 .94H-4 9 .62E-6 


M, there is not a great deal of difference between the weighting con- 
tributed by Hy,(w) or Gyx,(w). However, from Table VII it is obvious that 
the use of G2(w), when the tones are close together, will yield much more 
accurate estimates than Class I weighting. 

In equation (101) the “pointer”, a, was defined. The value of a@ 1s 
used by the system to determine whether the frequency being measured 
is above or below the frequency of the largest estimator, A,. Our 
simulation studies showed that under certain circumstances a, as 
calculated by equation (101) will point in the wrong direction. In 
general this happens when the contributions to A;_; and A;., due to 
the interference is equal to or greater than the difference in the con- 
tributions to the same estimators due to the tone being measured. Thus 
if | A,_1| | A741 |, a small difference in their magnitudes can change 
a. In our simulations this effect only caused trouble when f, and f, were 
separated by less than half the width of the main lobe of the weighting 
function, H(w). 

Since we have fixed B, = B,; = 1 in the simulations we have ignored 
the adverse effects of “large” level differences on the accuracies of the 
various methods. Leakage from an interfering tone with a high level, 
relative to the tone of interest, would certainly tend to reduce the 
accuracy provided by any of the three methods, no matter which weight- 
ing is used. 
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VII. CONCLUSIONS 


The discrete-tone measurement system we have been discussing is 
particularly well suited to systems that involve computer-controlled test- 
ing or measurement, provided the real time needed for the computations 
is available. Two advantages are: 


(i) The only interface hardware is the A-D converter (with its 
lowpass filter). 

(ii) The system is capable of measuring many received frequencies 
and levels during the same computation time. 

For a given number of samples taken at a given sampling rate, the 
accuracy of the system can be significantly improved through the use 
of some type of weighting function. We have examined three classes of 
such smoothing functions and have developed formulas which permit 
the extraction of received signal frequencies and levels from the DFT 
coefficients. The results of system simulations, presented in Tables V 
through VIII, show that the inherent accuracy of the described system 
can, through the proper use of weighting functions and estimation 
methods, be made satisfactory for many applications. 

With Method 2 considered to be a special case of Method 3, the tables 
show that the best estimation method, for all of the weighting functions 
examined, is Method 38. 

The tables also show that there is no “‘best”’ weighting function. The 
weighting to be used for any particular application should be selected 
only after a consideration of the expected tone frequencies, the relative 
levels, the measurement accuracy desired, and the desired value for N. 
The sampling frequency, N/T’, should be more than twice the highest 
frequency to be measured. 

It is interesting to observe that the Taylor (Class II) weighting func- 
tion used in the simulation is, for the situations simulated, not signifi- 
cantly better than the Class III weighting, G,(t), which is essentially 
that proposed by Blackman.’ There may be other situations, however, 
when the nearly optimum main-lobe width of the Taylor functions is 
useful. 

If the system could tolerate the relatively large amount of computer 
memory required, then the discrete Dolph-Chebychev functions de- 
scribed by equation (27) could provide some advantages. 
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Reed-Contact Switch Series 
for the LF. Band 


By M. B. PURVIS and R. W. KORDOS 
(Manuscript received October 17, 1968) 


A series of switches using a miniature dry-reed sealed contact in a cable 
switch configuration has been developed to provide switching capability 
from dc to 100 MHz. We present a description of the development, per- 
formance characteristics, and mechanical design features. 


I. INTRODUCTION 


The nationwide network of transmission facilities is not only growing 
in number of routes and capacity but also in terms of service capability 
and administrative flexibility. Within the network there are usually 
alternate routes for providing service between two points. Intercon- 
nection between points may ultimately be controlled by a remote, 
centralized, real time machine that contains an accurate map of the 
state of the network. 

The broadband restoration system, for example, can detect failures, 
make routine maintenance checks and report to a regional control 
center where an alternate route between the two points is selected. The 
control center then remotely operates the wideband switch at each 
junction of the route to effect a restoration of service. 

One component group needed to implement these systems is a family 
of wideband switches capable of meeting the transmission requirements 
of low insertion loss, high isolation loss, high crosstalk loss, and having 
an impedance well matched to the 75-ohm system impedance. 

The 266B (8 X 8 matrix), 274A (1 X 8), and 273B (1 X 2) switches 
have been developed to meet these requirements with low operate power, 
small size, and moderate cost. All of these codes use 237-type miniature 
dry-reed sealed contacts in a cable-switch’ arrangement to provide an 
extremely high isolation loss in the open state and a low insertion loss 
and good impedance match in the closed state. Appropriate matrix 
configurations are achieved by interconnecting the cable switches with 
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stripline networks designed to provide good system performance from 
de to 100 MHz. The requirements, performance characteristics, me- 
chanical design features, and a description of the development of 
these new wideband switching matrices are presented in this paper. 


II. REQUIREMENTS 


The restoration transmission requirements for an 8 X 8 matrix of 
64 crosspoints used to interconnect 75-ohm coaxial transmission paths 
over the frequency range of de to 100 MHz are: 


Insertion loss 0.6-dB maximum 
(closed-contact loss) 

Isolation loss 95-dB minimum 
(open-contact loss) 

Crosstalk loss | 95-dB minimum 
Return loss 28-dB minimum 


Transmission requirements for the 1 X 2 matrix and the 1 X 8 matrix 
are identical to those of the 8 X 8 matrix except for the crosstalk loss 
requirement, which is not pertinent in single-level matrices. Where more 
than one switch is enclosed in the same housing, the crosstalk require- 
ment will apply between switches. 

Speed of switching is not a stringent requirement, and operation in 
the millisecond range is satisfactory. Compact matrix size and moderate 
manufacturing cost are additional features required for practical ap- 
plication in the restoration switching systems. 

The switch arrays, 8 X 8,1 X 8, and 1 X 2 are illustrated schemat- 
ically in Fig. 1. Photographs of the three switch types are shown in 
Fig. 2. The major elements of the design in a transmission sense are the 
coaxial crosspoint developed from the cable switch, the input/output 
circuit boards, and the coaxial jacks. From the schematic diagram, one 
can see that in the 8 X 8 and 1 X 8 designs the closure of any crosspoint 
leaves seven open crosspoints connected by stubs on each associated 
circuit board or ‘‘tree.”” Because of the length of these stubs, the struc- 
tural considerations become important design parameters that seriously 
affect performance. The design considerations in each of these elements 
are presented in the sections that follow. 


III. CABLE SWITCH THEORY 


An extremely high isolation-loss requirement in the megahertz range 
normally precludes the use of conventional electromechanical switches, 
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Fig. 1 — Array schematic: (a) 8 x 8, (b) 1 x 8, (c) 1 x 2. 


such as the wire spring and flat spring relay, the crossbar switch, and 
the ferreed, because of their generally large open-contact capacitance of 
between 0.1 and 1.0 pF. However, an arrangement of two or more 
conventional switching elements connected in series by a length(s) 
of low-loss transmission cable is particularly well suited for operation in 
broadband switching applications where extremely high isolation is 
required. This broadband switching crosspoint is called the cable switch. 

Applying conventional lumped constant analysis to a string of open 
switches (that is, serially connected switches with substantially zero 
transmission paths between them) produces the following conventional 
and well known voltage divider approximation expression: 
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| Vour/Vin | = oCR/K + 1 (1) 
when wR < 1 and where 


w = angular frequency, 

C = open switch capacitance, 
Kk + 1 = number of switches, and 

& = the load impedance. 





Fig. 2— Photograph of switch arrays: (a) 8 x 8, (b) 1 x 8, (ec) 1 x 2 (4 
switches per package), (d) 1 X 2 (1 switch). 
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In other words, each switch added to the string reduces the ratio by 
adding one to the denominator. 

The above expression does not apply, however, when coaxial cables 
are connected between the switches. In particular, for (i + 1) switches 
with K pieces of identical lossless coaxial cables interconnecting them, 
the following expression applies (see Appendix): 














Vout _ wR 

5 7 cn. = [acre 
Ar + (A = wlZ, sin Gd) oS Ae eA. 

n=0 
J 

a wR 
»2 Any. 
n=0 


for practical components where the values of A, are given in the following 
table: 


n A, 

0 1 

1 A 

2 2A’—1 


° O&O 


4A° — 3A 


K 2ZAArR- a Ax-. 


and A = cos 6d + sin B d/2wCZ, for lossless lines where: 8 = w(ezurz/c)? 
(phase constant). 


én = relative dielectric constant of coaxial cable, 
Lr = relative permeability constant of coaxial cable, 
c = velocity of light, 
d = distance between contacts from switch to switch, 


Z,) = characteristic impedance of coaxial cable in ohms, and the 
remaining symbols have the same meanings as in equation (1). 


When the length of transmission line, d, between two switching ele- 
ments equals one-quarter wavelength, equation (2) indicates that the 
isolation loss in dB of the overall switch is twice the isolation loss of the 
individual switching element. However, a plot of equation (2) for the 
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ISOLATION LOSS CABLE SWITCH (dB) 
ISOLATION LOSS SINGLE CONTACT (dB) 





0.005 0.02 0.04 0.06 0.08 O10 O12 O14 O16 O18 O20 0.22 0.24 0.26 
LENGTH OF TRANSMISSION CABLE IN WAVELENGTHS 


Fig. 3— Isolation loss improvement for the cable switch. Multiplying factor 
for the isolation loss in dB of a two-element cable switch as a function of cable 
length between switching elements. (Zo = FR). 
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Fig. 4 — Equivalent circuit for the cable switch: (a) open, (b) closed. 
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specific case of two switching elements (Fig. 3) shows that greater than 
one and one-half times the isolation loss in dB of a single switching ele- 
ment can generally be realized with a length of cable of only 0.005 
wavelength. In addition, for short lengths of transmission cable the 
increase in isolation loss of the cable switch over that of a single switch- 
ing element is relatively independent of frequency. This results in an 
extremely broad frequency bandwidth of operation. 

As equation (2) indicates, a further increase in isolation loss can be 
obtained by adding more cable sections and switching elements to the 
structure. This, of course, increases the insertion loss as well as the 
physical length of the cable switch. Alternately, a choice of Z, less than 
the system impedance will result in a further increase in isolation loss. 
However, in practice Z, is chosen equal to the system impedance in 
order to avoid an impedance mismatch between the switch input and 
the termination when the switching elements are closed. 

Figure 4 shows a schematic representation of the open and closed 





Fig. 5.— Crosspoint elements and assembly. 
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cable switch in which each switching element is approximated by a small 
closed-contact resistance. For the closed state, the insertion loss of the 
cable switch is about equal to the combined loss of the individual switch- 
ing elements since the loss of the short transmission cable is negligible, 
and the impedance of the cable is generally equal to the system imped- 
ance. This factor of two in the insertion loss for a two-element cable 
switch is the only penalty in achieving the marked improvement in 
isolation loss desired. 

The cable switch concept reduces crosstalk between various signal 
paths in the switching matrix because whenever the switch structure is 
effectively shielded, crosstalk will be defined in terms of the isolation 
loss of the individual crosspoints. 


IV. COAXTAL CROSSPOINT DESIGN 


The crosspoint assembly and its piece parts are shown in Fig. 5. A 
cross-sectional view of the assembly is shown in Fig. 6. As can be seen, 
the crosspoint consists of three miniature dry-reed sealed contacts 
separated by short lengths of coaxial line. Three contacts are used 
since the isolation loss of a single contact is approximately 45 dB 
(w0R = 107°) at 100 MHz in a 75-ohm system. The contact gap capaci- 
tance is approximately 0.2 pF. 

The outer conductor of the crosspoint is a copper tube of 0.210 inch 
O.D. with a length of 7.82 inches. The tube allows free movement of the 
contact assembly thereby minimizing forces on the contact leads and 
preventing rupture of the contact seals. The tube wall 1s 0.005 inches to 
provide adequate structural strength for the assembly. Since the signal 
penetration of the tube wall is only of the order of 300 microinches at 
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Fig. 6 — Cross-section of crosspoint assembly. 
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100 MHz, the integrity of the coaxial transmission line is preserved, 
and there is no crosstalk coupling to the control windings. 

The tube length provides approximately three inches between the 
contact make points with two inches of coaxial line between the glass 
bottles of the contacts. Since this line segment is loaded with a molded 
polypropylene sleeve whose dielectric constant is 2.3, the equivalent 
electrical length is about three inches or 0.03 wavelength at 100 MHz. 
Equation (2) indicates that the open circuit isolation loss in dB of this 
three-switch crosspoint approaches 2.3 times that of a single contact. 

The diameter of the center conductor and the dielectric material 
between the contacts proved to be one of the most easily changed vari- 
ables, and a wide range of diameters and materials were evaluated. The 
diameter of the copper center conductor that gave the best return loss 
in the 8 X 8 matrix was found to be 0.010 inches (Z, = 120 ohms). The 
higher impedance (with respect to 75 ohms) of the center conductor 
section is required to offset the capacitance of the tree networks as 
discussed in Section 5.1. 

The yield strength of the annealed copper conductor between contacts 
is reached with 1.25-lbs force so that any stresses applied to the cross- 
point will be absorbed by the center conductor thereby protecting the 
contact seal. 

In initial switch models, the center conductor was wrapped around 





Fig. 7—Crosspoint insertion loss: (a) with 237B contacts, (b) with 237G 
contacts. 
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the contact leads and soldered. Although performance was adequate, 
experience has indicated the use of the connecting sleeve, Figs. 5 and 
6, between the contact and the center conductor facilitates manufacture. 

A dielectric end plug fits over the contact lead and into the copper 
tube. This plug supports the lead during the assembly process. A heat 
shrinkable polyethelene tube is shrunk over the entire assembly to 
provide mechanical stability in manufacture. 

The insertion loss of the 237B contacts was found to be of the order 
of 0.15 dB at 100 MHz. This high loss characteristic occurs because 
only a relatively short length of the contact blade is plated with gold 
and silver. To provide a continuous surface conductor along the blade 
with a better conductivity than the nickel-iron blade alloy, a barrel 
plating process was developed by the Western Electric Company, 
Allentown Works. These barrel-plated blades are assembled in a re- 
cently coded contact, the 237G. The leads in this contact are solder 
dipped for 0.1 inch to provide easy assembly and a good bond with the 
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Fig. 8— Magnetic flux distribution along the axis of a crosspoint: (a) oper- 
ated, (b) interference from diagonally operated crosspoints in an 8 X 8 array. 
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sleeve. The insertion loss characteristics of the cable switch when made 


from the standard 237B contact and the new 237G contact are con- 
trasted in Fig. 7. 


Kach contact is driven by a 300 turn, 5 layer, 32 gage coil. The coils 
of a given crosspoint are series connected giving an overall resistance of 





Fig. 9 — Tree networks: (a) 8 X 8, (b) 1 & 8, (c) 1 & 2. 
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about 11 ohms. Figure 8a shows the flux distribution along the axis of 
an operated crosspoint. Figure 8b shows the interfering flux from diag- 
onal crosspoints in the 8 X 8 array. Current reversal in alternate columns 
of the array provides an interfering flux cancellation so the resulting 
flux level is well below the contact operate levels when driven at the 
design level of 4.5 + 1 volt. 


V. CROSSPOINT INTERCONNECTION 


5.1 Use of Tree Networks 


Interconnecting crosspoints in a matrix arrangement without intro- 
ducing serious impedance mismatch is a formidable problem, especially 
when the matrix is required to switch signals above 10 MHz. Since 
only one crosspoint in any row or column on the matrix can be closed 
and terminated at a given time, each closure can be represented by a 
continuous transmission path along which a number of open crosspoints 
are attached at various intervals. These open crosspoints are essentially 
open-circuit stubs which can easily degrade the transmission perform- 
ance of the switch by causing severe impedance mismatches. Moreover, 
any general matrix construction in which switching elements are bussed 
together in rows and columns can result in different lengths of open- 
circuit stubs depending on which crosspoint is closed. 

In an effort to make all transmission paths through the switching 
matrix appear alike electrically, tree networks, Fig. 1, are used to con- 
nect groups of crosspoints to the various input and output connectors 
of the matrix. Through use of tree networks the rows and columns of 
the matrix are formed so that no long open-circuited stubs exist. How- 
ever, short stubs still exist at the positions where the open crosspoints 
are connected to the closed transmission path. Open-circuit stubs 
are equivalent to small capacitors short circuiting the closed transmission 
path and, unless carefully designed, usually preclude meeting the return 
loss requirement of 28 dB at frequencies above 50 MHz. 

The tree networks for the 8 X 8, the 1 X 8 and the 1 X 2 matrices 
are shown in Fig. 9. 


5.2 Tree Network for 8 X 8 Matrix 


The equivalent circuit of a single closed transmission path through 
an 8 X 8 matrix is shown in Fig. 10. As can be seen, the circuit is sym- 
metrical from end to end, making the electrical characteristics identical 
for both directions of transmission through the matrix. 

The open-circuit stubs.represented by short capacitors C, , C., and 
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Fig. 10 —- Equivalent circuit of a closed connection through an 8 X 8 matrix: 
(a) untuned circuit (R = contact resistance of the 237G contact; C1, Co, Cs = 
equivalent capacitance of open-circuited stubs; h, b, ls, 4, ls = 75Q transmission- 
line lengths), (b) low-frequency approximation of the tuned circuit. (Rsx = 
shunt resistance to adjust real part of input impedance, fh, Lz, Ls, In, Ls = 
equivalent inductance of high-impedance transmission lines.) 


C; in Fig. 10a can be effectively tuned out by 


(z) adjusting the lengths of J, , l,, l; and 1, and increasing their 
characteristic impedance to a value higher than 75 ohms, and 

(it) increasing the impedance of /; , the two inch coaxial line between 
the 237G contacts, above 75 ohms. 


The low-frequency equivalent circuit of this tuned network is shown in 
Fig. 10b, where the high-impedance transmission lines are approximated 
by series inductors. 

While this analysis provides physical insight into the factors affecting 
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Fig. 11 — Cross-section of tree network for an 8 & 8 matrix. 
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the design, it cannot provide a solution in closed form to the design 
problem. The organization of the 8 X 8 matrix was dictated by such 
factors aS minimization of the stub lengths by minimizing the distance 
between crosspoints, symmetry in the array, interfering flux from ad- 
jacent crosspoint coils, and the availability of suitable materials for 
design. 

The 0.375-inch dimension between crosspoints is a reasonable lower 
limit for mechanical assembly of the crosspoint array since the O.D. 
of the driving coils is approximately 0.31 inches. Circuit board material 
standard thickness is 53; inch so that the buildup of the strip line tree 
circuits is readily accomplished. Flux interference was not a problem 
as seen in Fig. 8. The question became, then, whether utilizing the 
insights given by the above circuit analysis, the system design re- 
quirements could be met for the adopted physical configurations. 

Initial efforts on the tree structure followed conventional stripline 
technology in which a planar center conductor is positioned between 
parallel ground planes by dielectric layers. Dielectric materials with 
a low loss and a uniform dielectric constant over the frequency band 
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Fig. 12— Equivalent circuit of a closed connection through a 1 & 8 matrix: 
(a) 1-to-8 direction, (b) 8-to-1 direction 
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include the polyolefins and polyphenylene oxides. These are widely 
used for microwave (300 MHz to 300 GHz) printed circuits. However, 
the frequency band of concern is well below that for which these more 
expensive materials are designed, and their generally poorer mechanical 
stability and peel strength suggested an alternate approach be taken. 

Epoxy glass has relatively better mechanical properties and peel 
strength and is more economical than the above materials. It also has 
considerably higher loss and dielectric constant. 

The compromise solution, Fig. 11, is the use of 3%-inch-thick epoxy 
glass board with one-ounce copper and an air space above the circuit 
to lower the effective overall dielectric constant. The impedance levels 
and capacitance of the various branches of the tree were adjusted 
experimentally with the pattern shown in Fig. 9 resulting. 

A shunt resistance, as seen in Fig. 10b, was added at the input and 
output of the circuit to compensate for the series resistance of the tree 
and crosspoint. The resistor used, Fig. 9, is the 257A type ceramic with 
evaporated tantalum nitride film element. It is mounted directly on the 
board by its leads. 

The ground plane spring is formed from a single stamped part of 
five thousandth-inch beryllium copper over-plated with fifty millionths 
of hard gold for corrosion resistance and sealing to the end plates and 
side rails of the switch assembly. The thickness of the material guaran- 
tees 2 minimum of 100 dB crosstalk loss through the circuit boards in 
the assembled switch. The spring’s rolled edges compress when slid 
into the side rail slots while the front slotted edge seats firmly to the 
face of the switch. The circuit board assembly is forcibly held in position 
by the bracket which seats against the rear slotted edge. The result is 
a well defined geometry insensitive to the temperature ranges to which 
the switch will be exposed and efficiently sealed against crosstalk. 
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Fig. 18 — Low-frequency approximation of a tuned path through a 1 X 8 matrix. 
(Ia, Ju, Is = equivalent inductance of high impedance transmission lines; 
Csx = equivalent lumped capacitance of open-circuited stubs; R = contact 
resistance of the 237G contact; Rsz = shunt resistance to adjust real part of 
input impedance.) 
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5.3 Tree Network for the 1 X 8 Matrix 


The tree network for the 1 X 8 matrix is shown in Fig. 9. The tree 
differs from that used in the 8 X 8 matrix because the 1 X 8 matrix 
is physically asymmetric from end to end. In such an array the tree 
is used to either connect one input to one of eight possible outputs (1- 
to-8) or to connect one of eight possible inputs to one output (8-to-1). 
Fig. 12a shows the equivalent circuit of a closed transmission path 
through the matrix in the 1-to-8 direction, and Fig. 12b shows the 
equivalent circuit for the 8-to-1 direction. The only possible way to 
approach a good input impedance match for this matrix for both direc- 
tions of operation is to: 


(2) minimize the lengths /, and J; so that C,, C,, and C; can be 
approximated by a single shunt capacitance, C'sx , 

(72) adjust the length of /, and the characteristic impedance of 
l,, lg, and l; such that the low-frequency equivalent series 
impedance of J, equals l, + 2l; , and 

(177) add a shunt resistance across C's, to compensate for the series 
resistance of the tree and the crosspoint. 


As can be seen from the low-frequency equivalent circuit of the resulting 
network, shown in Fig. 13, the input impedance is essentially the same 
when viewed from either end with the other end terminated in the 75- 
ohm system impedance. 

The circuit board for the 1 X 8 matrix is shown in Fig. 9. In order 
to achieve the higher stripline impedance, a thinner epoxy glass board 
is used, and the ground plane spacing is increased as shown in Fig. 14. 


5.4 Tree Networks for the 1 X 2 Matrix 


The tree network for the 1 X 2 matrix is also shown in Tig. 9. The 
analysis parallels that presented for the 1 X 8 matrix. The simpler 
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Fig. 14— Cross-section of tree network for 1 * 8 and 1 & 2 matrices. 
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nature of the matrix was confirmed by the relative ease of obtaining an 
experimental solution to the design problems. 


VI. CONNECTORS 


Standard 478A and 477B jacks accommodating type 728A coaxial 
cable are used for signal interface. The 478A jack flange was modified 
to allow closer placement of the input jacks on the 8 X 8 array. 

The crosspoint control windings for each crosspoint are individually 
terminated in standard commerically available connectors allowing 
wide latitude for control circuit design. 


VII. STRUCTURE 


Since the structure carries the signal ground, the integrity of the 
structure at each crosspoint is vital to the maintenance of high return 
loss. A failure of a tube joint will bring the return loss for that crosspoint 
to 10 dB even though all other joints are structurally sound. The struc- 
ture must also be tight to prevent crosstalk. Gaps at the flanges of the 
connector, between the tree circuits and the tube face, or along the rail 
at the spring will quickly raise the crosstalk above the minimum re- 
quired limit of 95 dB below the signal level. 

The tube array in each switch code is fixed to the base on one end and 
pinned on the other to provide axial freedom for thermal expansion. 
This degree of freedom is sufficient to protect the solder joints between 
the tubes and the end plates and those between the contact assembly 
and the circuit boards. 

The switch assembly has been vibration tested over the range of 5 
to 500 Hz. Resonant points were found for the structure in early models 
and modifications were made to provide a stiffer structure at those 
frequencies. Tests of the switch models shown in Fig. 2 led to the in- 
clusion of shipping blocks in the 8 X 8 switch to provide damping of 
the pinned end of the tube assembly during transport. The switches 
otherwise withstand anticipated shock and vibration in normal shipping 
and installation. | 

The switch has been cycled over the temperature range of 40°F to 
140°F at relative humidities to saturation without incident. 


VIII. PERFORMANCE 


Switch performance has been measured at three Bell Laboratories 
locations: Merrimack Valley at North Andover, Massachusetts; 
Holmdel, New Jersey; and Columbus, Ohio. The switch has also been 
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Fig. 15— Return loss for a single crosspoint in an 8 X 8 array: (a) 8250 Q 
resistor, (b) 2740 Q resistor. | 
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Fig. 16— Return loss for the 8 x 8 array from de to 100 MHz: (a) with 
8200-ohm shunt resistance, (b) with 2740-ohm shunt resistance. 
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measured at the Western Electric Works in Kansas City where produc- 
tion has been scheduled. 

Return loss results on a given piece of apparatus are reproducible 
to within 1 dB over the range from 28 to 48 dB. Insertion loss is repro- 
ducible to 0.02 dB. 75-ohm attenuators are used in bridge circuits as 
calibration standards for the measurements program. 


8.1 Return Loss 


The return loss for a single crosspoint of an 8 X 8 array is shown in 
Iigure 15. The presentation is in the form of a Smith chart. The trace 
is swept from 100 kHz to 100 MHz with the output of the switch ter- 
minated in 75 ohms. Two curves are obtained for each crosspoint by 
testing each end as the input. Since the physical structure 1s mechanic- 
ally symmetric, performance should be the same. The electrical asym- 
metry observed is the result of mechanical tolerances of switch elements 
with respect to the tuned circuit performance when seen from opposite 
ends. 

The return loss characteristic for two 8 X 8 arrays is shown in Figure 
16. The envelope of performance of all sixty-four crosspoints swept 





RESISTANCE COMPONENT 
R/Zo 





Fig. 17 — Return loss for the 1 X 8 array from de to 100 MHz: (a) 1-to-8 
direction, (b) 8-to-1 direction. | 
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from 100 kHz to 100 MHz from both ends, that is, 128 curves on each 
graph, is shown. The return loss is obviously better zat the lower fre- 
quencies where the geometric factors are a smaller fraction of a wave 
length. No one crosspoint defines the envelope of the plot for more than 
a fraction of the frequency swept. The scatter at the higher frequencies 
is a function of both switch geometry and mechanical tolerances. The 
return loss of the switch with 8200-ohm shunt resistance is seen to be 
better than 30 dB over most of the band. The improved performance at 
the higher frequencies obtained with the 2740-ohm resistance is at the 
expense of the performance at the lower frequencies. In either case the 
performance limit of 28 dB is met. 

The return loss for the 1 X 8 array is shown in Figure 17. The eight 
traces for either end as input indicates the design compromises required 
to meet the return loss objective. The switch could be optimized for 
either input at the expense of the return loss for the opposite end input. 

The return loss for the 1 X 2 switch is shown in Figure 18. No shunt 
resistor was used in this design since it readily meets the requirements 
for either end as input. 


RESISTANCE COMPONENT 
R/Zo 





Fig. 18— Return loss for the 1 xX 2 array from de to 100 MHz in 1-to-2 
direction and in 2-to-1 direction. 
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8.2 Insertion Loss 


The insertion loss for the 8 X 8 array is shown in Fig. 19. Only the 
two crosspoints are shown which represent the upper and lower limits 
of the insertion loss for the switch. Fig. 19b indicates the loss for the 
switch with shunt resistors of 8200 ohms. Fig. 19a indicates the loss for 
the switch with 2740-ohm shunt resistors. The higher insertion loss of 
the second switch is seen to be the penalty for the improved return loss 
as shown in Figs. 16a and 16b. 

Figures 20a and 20b indicate the insertion loss for the 1 X 8 and 
1 X 2 arrays respectively. 


8.3 Isolation and Crosstalk Loss 


These losses were found to exceed 105 dB across the band on all 
codes of switches. The crosstalk loss is at least as good as the isolation 
loss since the results of crosstalk loss are essentially the same as for 
isolation loss. There are no significant differences between near end, far 
end, terminated and unterminated crosstalk observations. 
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Fig. 19— Insertion loss for the 8 & 8 array from de to 100 MHz: (a) with 
2740-ohm shunt resistance, (b) with 8200-ohm shunt resistance. 
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Fig. 20—Insertion loss from de to 100 MHz: (a) 1 x 8 array, (b) 1 X 2 
array. 


IX. SUMMARY 


A series of switches designed to meet Bell System requirements across 
the de to 100 MHz band have been designed and models built. Tests 
show the performance characteristics over the band meet system re- 
quirements and that the physical structures are mechanically satis- 
factory for the environmental conditions anticipated in transport and 
installation. 


APPENDIX 
Derivation of the V 5ut/Vin Relationship for the Cable Switch | 


A.l Network Transfer Matrix 


Any network can be described in terms of an ABCD transfer matrix: 







I, Tp V, = input voltage 
I, = input current 
NETWORK V, = output voltage 
I, = output current 
id oe eal | : @) 
I, CG Dy ds 








where (4 3) is the ABCD transfer matrix. 
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A.2 Overall Matrix 


The overall ABCD matrix of a series of networks in tandem is equal 
to the matrix multiplication of the individual network ABCD matrices: 


NETWORK ~1 NETWORK ~—2 NETWORK- 3 


lout 





A B 
C D 














A, Bz} 
Co “ 


A, i] 4) 


: er Bs) 
overall C; D, C; D, 


A.3 Equivalent Matrix 





The equivalent circuit for one of K identical sections of the cable 
switch is given by: 


ae! ) aoe 
LOSSLESS 
TRANSMISSION 
2C CABLE 2c lege 
| 
| 
Vin | 
J 


y | | Y 


Je NETWORK~1 ~~ re NETWORK-2 — ole NETWORK~3 ->| 


Now, the ABCD matrices for the 3 networks in this circuit are: 


> 
ao 2 
4 


Networks-1 and 8 











be B, Hen Bel. i sisted (5) 
C, OD, C, Ds ae 2 
Network-2 
A» " o | cos Bd 42, 8in ¥ 6) 
Gs DD; jsin Bd/Z, cos Bd 





where 8 = 27/d 


4 = wavelength 
d = length of transmission line 
Zo = characteristic impedance of the transmission line. 


The ABCD matrix of one of the K identical sections can be obtained by 
matrix multiplication: 


REED-CONTACT SWITCH 253 


sin Bd cos Bd sin Bd __ 








bea . _ [08 BET 207, jw Hw2C)Ze 1 igen Be ~ 
Gu. Dia .sin Bd sin Bd 
Iv, cos BE + 507, 


Note: Ai, = Di,. 


AA Swiich Circuit 
The equivalent circuit for the entire cable switch is given by: 


2c Peay 








NETWORK-~IN NETWORK-OUT 


The ABCD matrices for the networks in this circuit are: 


Network-OUT 











Ae ~ _ 1 + 1/jw2CR _ (8) 
UCout Daout ‘ 1/R 1 
Network of K identical sections 
os Bey (Aw, Bale 0) 
Cx Dx Ci, Di 
Note: Ag = Dx. 
By matrix multiplication: 
n A, B,, CO 
0 1 B, (0) C,.(0) 
1 A, B,,(1) C.(1) 
2 2A = 1 B,(2A,;) C’,,(2A,.) 
3 4A, — 3A,, B,,(4A;, — 1) C,(4A7, — 1) 
K-1 


K. DAY Agee Aes. Bis Zs AG Ae. BOs. Py AX-1-"4, (10) 


Network-IN 
Ain Bis _ ; a) (11) 
Ge. D;n J 0 1 





254 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


The overall ABCD matrix for the cable switch can be obtained by 
matrix multiplication: 














A B 
C D overall 
J Bx, Cx ( | Ax _ Ox 
Ac(I a 3) T R ar qa2C a: w2CR/ jo + Br am 
Ae 1 Cy 
| R + cx(1 T ail Ae j02C 
(12) 


A.68 Special Case: Iu, = 0 
For the case where J,,, = 0: 





Vise = 
Vin ee = 1/ Aoveral! . 
Therefore, for the cable switch: 
eer 1 





= es 13 
C, (13) 


1 Br ( 1 ) 
Ac(1 a sal TR T 706 \ + jo90R 
Since wR < 1 for practical values of C and R, 


c me jo Ch on (14) 
in : K 

Ar + 7B rw + iw4C 
Substituting the expressions for Bg and Cx in (10) and the expressions 
for B,, and C;, in (7) into (14) gives: 


a _ qoCh 


Vin 














we Ag+ (cos Bd + SUE ee CZ, sin ed) 2 An A 
K w2CZ, w 0! a ls n 
Vass CR. 
| ee | = ____#__ a 
. Ax + (A,, — wCZ, sin Bd) D5 AXA, 
n=0 
Since A,, >> w(Z, sin 8d for practical values of C and Z, , 
Vise R 
| Sie (16) 
ame > AA. 
n=0 
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A Lumped-Circuit Study 


of Basic Oscillator Behavior 


By N. D. KENYON 
(Manuscript received August 27, 1969) 


This paper presents an experimental study of the oscillations set up 
ma circuit consisting of a negative conductance and a mulitple-resonant 
load. Its purpose was to verify that such a circuit can account for many of 
the irregular phenomena commonly observed during tuning of practical 
microwave solid-state oscillators; such effects as discontinuous frequency 
changes, low circuit Q-factors, power variations, spurious oscillations and 
noise conditions are all readily reproduced in a simple low-frequency ana- 
logue. There ts close correspondence with a first-order analysis. 


I. INTRODUCTION 


In the course of routine locking-bandwidth measurements on 
IMPATT diodes in the 50-60 GHz range, some observations were made 
that could not be accounted for by the simple theory’~* of injection 
locked oscillators. In particular, locking ranges of about 100 MHz for 
—40 dB injected power were measured, indicating an extremely low 
effective circuit Q-factor (<5). In addition this range was asymmetrical 
about the free-running frequency, and was not exactly proportional to 
the injected voltage. On occasions there was at one end of the locking 
range a hysteresis between locking and unlocking conditions. 

Other phenomena commonly observed during tuning experiments on 
solid-state oscillators include the following: 


(z) A discontinuous change in frequency (here referred to as a ‘‘jump’’) 
as a parameter is varied (bias current, perhaps, or a tuning stub). 
If the tuning is reversed the jump occurs at a displaced frequency 
(“hysteresis’’). 
(72) In the neighborhood of a jump, the hitherto single line spectrum 
may acquire sidebands at displacements of order 0.1 to 1 percent. 
(zzz) Under some circuit conditions a broadband noisy output may be 
obtained. 


250 
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To explain these effects the passive circuit ‘“‘seen’’ from the diode 
terminals must be treated as more complex than the simple resonant 
circuit normally assumed. Kurokawa” has analyzed the case in which 
the active element is as simple as possible, but is connected to a passive 
load impedance of general form Z(w); it is found that most of the ob- 
servations can be accounted for by ascribing certain patterns to the 
locus Z(w). The conditions are summarized in Section II. 

Since the form of Z(w) for a packaged diode in a waveguide structure 
can be very complicated, is difficult to determine precisely, and more 
difficult still to design, a lumped-circuit approach at low frequency 
was used to verify the theoretical predictions. The frequency of 350 kHz 
was chosen so that measurements would be relatively unhindered by 
stray capacitance effects, but that spectra could be analyzed with 
sufficient resolution. 


II. SUMMARY OF ANALYTICAL PREDICTIONS 


When a negative conductance —G + jB is connected to a load 
G + jB, the voltage amplitude A and frequency w of the resulting 
equilibrium oscillation are determined by 


G(A) = G&) (la) 

—B(A) = Bio). (1b) 

These equations define the intersections in the complex impedance 
plane of the loci G(A) — jB(A) = —Y and GW) + jB(w) = Yo), 


these being referred to as the ‘‘device line” and “load line” respectively. 
For a single-tuned parallel-resonant circuit, Y(w) is a straight vertical 
line; for multiple-tuned circuits it may acquire bends and loops. From 
Kurokawa’s analysis of the latter situation, the following predictions 
emerge: 


(z) Let 6 be defined as in Fig. 1. Stable oscillation at the point P is 
only possible if 0 < @ < wz. Moreover, whenever the condition @ — 0 
obtains, the noise of the oscillations will greatly increase. 

(22) In Fig. 2, the line PT is the device line or, if that is not straight, 
PT is the tangent to the device line at P. The point T lies on a line drawn 
through the frequency points w) - Aw on the load line. If the resistive 
component G’ of PT satisfies 

1, 0G 


G’ — —sApo aA 


then spurious oscillations will grow at w) + Aw. If spurious oscillations 
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Fig. 1 — Definition of 6 at oscillation point. 


are not small, the values of w) , Ayo , and so on will be affected, and the 
first order condition above will not apply. 

(277) Figure 3 shows an injected signal of small amplitude a, at a 
frequency w, close to, but not coincident with, the intersection of device 
and load lines at w) . The perpendicular d is constructed from w, to the 
device line (which is here assumed straight); locking occurs if the length 
of d is not greater than a,/A, . There are additional requirements for 
stability of locked oscillation, the principal one being that the angle 8 
be less than 7/2. 

(zv) The relationship between injected current, oscillation amplitude 
and locking range for a near-horizontal device line (8B/dG ~ 0) is 
determined by | AB | = a,/A,, AB being the susceptance change from 
w@) to unlocking frequency. Thus if a and A, are held constant the ex- 





C—_ 


Fig. 2— Definition of G’ for spurious sidebands. 
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Fig. 3 — Injection locking conditions. 





tremities of the locking range will be those for which B(w) — Bw) = 
dp/Ao . 

If the device line is not horizontal, but has a small gradient g, then 
AB must be corrected by the factor (1 — g AG/AB). 

(v) The theory can be applied equally well to the case of a negative 
impedance —R(A) + 7X(A) simply by reading current for voltage and 
vice versa, and substituting R, X, Z, and so on, for G, B, Y. The circuit 
used must then of course have large impedance far from resonance, to 
inhibit undesired oscillation. 


Ill. ACTIVE ELEMENTS 


Both negative conductances and negative resistances were used, 
taking the circuit forms shown in Fig. 4. The negative conductance is 


Ry 


= (a fe - 48 = (b) 


Fig. 4-—- Negative admittance and impedance circuits. 
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readily seen to be 


ys —ph, am (R, + fh. + R3) 
(Rh, + R.)Rs 


where #3; is the amplifier output impedance and uy the voltage gain. If 
there is a phase shift (2n7 + ) across the amplifier, this equation con- 
tains a complex up: 


we = po(l + jy) 


which introduces an imaginary component B into Y. The amplitude 
A of oscillation depends on the saturation behavior of the amplifier 
u(A): as A grows, u(A) (and therefore G@) declines until the equilibrium 
condition (la) is established. At the same time the frequency settles 
such that the total circuit susceptance is zero (1b). 

The theory leading to the predictions of Section II assumes that the 
device characteristic Y(A) is independent of frequency. This is so here 
if w(A) is not frequency dependent and the circuit is free of parasitic 
capacitance. 

The negative impedance of Fig. 4(b) is 


i= —yR, + R;. 


By making R, low, Z is confined to a few hundred ohms. The behavior 
of Z(A) is very similar to Y(A) above, but the circuit has the disad- 
vantage that no point in the oscillating loop can be grounded; thus 
large errors arose when injection-locking characteristics were measured. 
We confine our attention here to the negative admittance circuit. 


IV. METHOD OF MEASUREMENT 


The amplifier used was a C-Cor 1319F transistor video amplifier, 
with a small signal gain of 40 dB into 50 ohms, and a bandwidth of 15 
MHz. The characteristic Y(A) for a given feedback resistor was estab- 
lished by direct measurement, not by calculation from yp, R,, R., RB. 

Figure 5 shows schematically the essentials of the experiment. Apart 
from the active Y and passive Y(w) networks, there is an oscilloscope, 
an injection signal source, a monitoring circuit for the frequency spec- 
trum, and a bridge for passive admittance measurements on Y(o). 

Monitoring equipment is hung on to a current pick-up lead, which 
has a negligible loading effect on the circuit. From a counter the fre- 
quency is known to 0.01 kHz, oscillations usually being stable to this 
degree. The wave analyzer determines to the same accuracy the fre- 
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Fig. 5 — Schematic of low-frequency experiment. 


quency of sidebands. Finally a convenient display of the spectrum is 
provided by up-converting into a spectrum analyzer. 

To make passive measurements the negative admittance was removed 
and the RF bridge connected by a short cable to the points shown. The 
wave analyzer was employed as a very sensitive null detector (to —90 
dB). 

The arrangement of Fig. 6 was convenient for giving an oscilloscope 
display of the admittance. 
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Fig. 6 — Admittance display circuit. 
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The oscilloscope beam is driven in a circular path of radius propor- 
tional to the current through Y(w). The spot is brightened by a 2 ns 
pulse at the instant of maximum voltage. The ac voltage amplitude is 
approximately constant. The circuit is obviously of limited bandwidth 
and was not used for quantitative measurements. 


V. MEASUREMENTS 


5.1 Characterization 


The device line —Y(A) was established from oscillations with a 
single-tuned parallel-resonant circuit (Fig. 7, inset). Since the imaginary 
component B was small, the frequency was close to the resonant fre- 
quency, and the latter was varied by changing L. The oscillation fre- 
quency w) and amplitude A were measured, and then the bridge was 
used to determine the admittance Y(w)): this quantity is plotted di- 
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Fig. 7 — Conductance saturation curve at 350 kHz. 
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rectly as the device line, since —Y(A) = Y(w,) at equilibrium. Figure 
7 gives the conductance-saturation characteristic at 350 kHz, these 
amplitude points being then superposed on the device line plot of Fig. 8. 
The variation of these characteristics over the range 310-390 kHz were 
found to be quite small. The device line is seen to be neither steep nor 
sharply curved: thus, the negative admittance is a good approximation 
to the idealized one assumed in the analysis.’ 

It may be worth noting at this point that tuning of the shunt in- 
ductance L is roughly equivalent to a complete vertical displacement of 
the locus Y(w) in the complex admittance plane. The device line in this 
experiment remains stationary, and the frequency of oscillation and 
amplitude vary according to the changing point of intersection. However 
similar phenomenological observations are to be expected if it is the 
device line that undergoes a vertical displacement—this commonly 
occurs for bias current changes of solid-state microwave sources. In 
succeeding paragraphs it is assumed that only the relative vertical 
relationship of device and load lines is significant, and that this can be 
changed at will by tuning of L. 


5.2 Double Resonant Circuit 


The addition of a series resonant circuit L, , C, , R across the parallel 
L,, Cy, provides a number of interesting situations (Figs. 9, 11, 18). 
The resonant frequency is the same for both circuits. 

Defining Q, = CR, Q2 = wL/R, and b = B/R, we have 


0b/dw |o, = 2C,R — 2Ln/R = (2/0 )(Qi — Qe). 


The susceptance of the series circuit tends to compensate that of the 
original parallel circuit in the neighborhood of resonance, and the com- 
bination becomes broader band than either of the two; the effective 
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' Fig. 8— Device line plot at 350 kHz. 
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Fig. 9 — Double-tuned circuit behaviour. 


Q at w is the difference Q, — Q2, which may obviously be made as low 
as desired. We shall consider in turn the three cases Q; > Q., Qi: = 


Q2, 41 < Qe. 


5.2.1 Case 1: Q: > Qe 


Figure 9 shows a typical measured Y(w) locus for this case. We 1m- 
pose the further condition that at no point is the device line steeper 
than this load line so that the requirement 0 < 6 < m (See z in Section 
II) is met. The measured amplitude at each frequency is also plotted 
in Fig. 9, and corresponds to the appropriate points of the device char- 
acterization, including the general decline with increasing frequency. 
By comparison with the single-resonant case the frequency points are 


264 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


relatively crowded at the center of the locus, and there is considerable 
variation of G in this region. When this circuit is connected to the de- 
vice, the oscillations generated vary continuously in frequency and 
amplitude as L, is tuned, and there are no unstable points. 

The small-signal locking behavior of this kind of circuit was in- 
vestigated. A variable-frequency source of constant short-circuit current 
(5 mA peak) injected a driving signal into the terminals of Y (See Fig. 
6). Oscillation amplitude at 350 kHz was determined, together with 
the frequencies w, , w. at which the oscillator fell out of synchronism 
with the driver. Then the negative admittance was removed, and the 
change of susceptance 2AB of the passive circuit between w, and wz was 
accurately determined (by the addition of known capacitances across 
the terminals to maintain bridge balance). This procedure was repeated 
for various (Q; — Q.) by changing R. 

The results appearing in Table I show good consistency of AB with 
a/A, except for the last two cases. These have critically low Q and the 
locking phenomenon is affected by excessive noise and drifting. Figure 
10 shows susceptance and frequency ranges as a function of injected 
current level for a particular value of R and A. The AB — a relationship 
is linear, as expected, while the locking range falls off at higher levels. 
Over the linear region the voltage-gain X fractional-bandwidth product 
is constant, as for a single-tuned circuit, namely, 


g Xb = 2/(Q; — Q2). 


We have thus confirmed that the locking range of an oscillator for a 
specific circuit 1s w; — w., where 


| B(w;) — B(ws) | = 20/A. 


[For microwave circuits the right-hand side would be better expressed 


TABLE I 

2Af A a/A AB 

(kHz) (Volts) (mmho) (mmho) 

1.95 5.8 0.85 0.82 

2.6 5.3 0.94 0.96 

3.6 4.9 1.02 1.05 

5.5 4.7 1.08 1.08 

7.1 4.6 1.08 1.10 

9.7 4.5 1.12 1.06 

16.6 4.5 1.11 0.99 (Qi, = Q:) 
19.0 4.5 1.11 0.94 (Qi < Q:) 
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Fig. 10 — Locking range vs. driving signal in broadband circuit. 


as (4/R)(P;/P,)*, where P, is the output power and P; the available 
driver power.] Moreover the combination of two tuned circuits of 
similiar Q can give a very low effective Q and correspondingly large 
locking gain-bandwidth product. 


5.2.2 Case2:90,2Q. 


Under this condition a cusp appears on the admittance locus: Fig. 
11(a) compares this with the Q. = 0 case (—markers are at 5 kHz 





(bb) (2.5 kHz/cm) 





Fig. 11 — Circuit admittance for Qz = 0 and Qs = Q; cases. 
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intervals). Oscillations right at the point of the cusp are very noisy 


(Fig. 11b). When Q, 1s slightly greater than Q., the load line may be 
parallel to the device line (@ = 0) over a considerable band, and the ex- 
tremely noisy broadband output results (Fig. 12). 


5.2.3 Case 3:Q. > Q; 


A loop appears in the admittance locus (Fig. 18a). Part z in Section 
II states that if the device line has the slope indicated by the dotted 
line, the region between the points A and C (at which the device line 
would be tangent) has 0 > @ > —7 and is therefore unstable. As the 


~Y (A) 


Y (w) 3 





b) (skHz/cm 


Fig. 12 — Broadband noise conditions. 
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Fig. 13 — Admittance loop for Q2 > Qu. 
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circuit is tuned (by decreasing L,) the frequency moves through D to A, 
at which it jumps to B; tuning the other way, the frequency moves down 
to C, then jumps further down still to D. Fig. 13(b) shows the behavior 
of the amplitude. There are two regions of overlap DA, BC in which 
oscillation is entirely stable at either of two points, depending on the 
way in which this oscillation was set up. By accurate measurements of 
the slope of Y(w) at the jump frequencies, and the slope 0B/dG of the 
device line at the appropriate values of A and f, it was established that 
the jumps AB and CD occurred very close to the points of tangency of 
the device line to the Y(w) locus. 

If a large parallel conductance G, is added to Y(w), all oscillation 
ceases (Fig. 14). Reduction of G, is now equivalent to a leftward dis- 
placement of Y(w); at the dotted position oscillation of low amplitude 
is initiated. Clearly this oscillation can never begin at any point on the 
loop, except the cross-over point P. If a low-amplitude signal is initiated 
at P, it is possible for the spectrum to contain both frequencies, though 
usually one will predominate. 


5.3 Spurious Oscillations 


The condition of part 22 in Section II requires that the line joining 
points wy -- Aw should intersect the tangent to the device line through 
wo at a point to the left of A,. Considering Fig. 15(a) (shaded part 
unstable) we see that for the double-resonant circuit such intersections 
invariably occur to the right. In practice, noisy sidebands were only 
observed in such a circuit when the device-line had been given a sharp 
curvature. 

Figure 15(b) shows that the sideband condition will be fulfilled if 





G—- 


Fig. 14 — Signal initiation. 
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Fig. 15 — Spurious sideband conditions. 
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Fig. 16 — Sideband measurements. 
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the frequency points to the left of w) can be crowded together. This was 
achieved by the addition of another series resonant circuit of inter- 
mediate Q and tuned to about 330 kHz. It was then very easy to get 
spurious sidebands over an appreciable range of frequencies. A typical 
case is shown in Fig. 16. As Z, is reduced the frequency climbs to 343.5 
and jumps to 380; between 340 and 348.5 there are noise sidebands, 
beginning at 340 with very small amplitude, progressing through greater 
amplitudes and additional sidebands at w) + nAw; before the jump, 
the first sidebands were almost the same amplitude as the principal 
oscillation, and interstitial components at w) -- nAw/2 and so on also 
appeared (lig. 17). 

Accurate measurements were made in the case of very small side- 
bands, for which the small signal theory may be expected to apply. 
In Tig. 16 the admittance Y(w) is shown accurate to within 1 mmho 
(relatively) on each scale. The line through w, + Aw is thus subject 
to little uncertainty. The measured device-line slope at w. gives an 





b) 341.5 kHz (C) 343.5 kHz 


Fig. 17 — Oscillation spectra. 
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intersection at 7’ as shown, and PT & 33 mmho; the measured value of 
(—3A, 0G/dA) was 39 mmho, indicating that the intersection should 
have been more to the left. Other similar measurements gave discre- 
pancies of the same order and of both signs. Obviously only a small 
error in the amplitude or device slope measurements will move the 
calculated intersection considerably. Moreover it is not known how the 
weak function Y(w) alters the theoretical sideband condition. 


5.4 Further Observations 


Hysteresis between locking-in and unlocking frequencies was observed 
even with double-tuned circuits, usually where the load-line 1s most 
inclined to the vertical. The effect is ascribed to the second-order effect 
of amplitude variations, and is such that more power is required to 
pull-in an unlocked oscillator than to maintain locking at the same 
frequency. 

Low-frequency switching between two states was found for some 
critical adjustments of a triple-tuned circuit in which two loop-type 
resonances interfere. The time-constant was associated with that of 
build up and decay of oscillations in the circuit. 

The oscillator could be pulled by a small injected signal, when the 
frequencies were widely different but the circuit admittances similar, 
as by the cross-over of a loop. 

Where spurious sidebands were present, locking of the whole pattern 
occurred whenever the driver was close to any of the sidebands. 


VI. CONCLUSION 


Measurements on a low-frequency lumped circuit model have sub- 
stantiated earlier theory’ concerning the interaction of a negative ad- 
mittance and a multi-resonant circuit. The model described is not cap- 
able of simulating devices whose characteristics are more complicated 
than the simple form Y(A). For example, it is not easy to duplicate with 
lumped elements the observations which have been made on microwave 
oscillators, involving generation of the same frequency at two different 
diode bias currents, or utilizing second-harmonic tuning or sub-har- 
monic pumping. 

However the experiment shows that many of the complex phenomena 
associated with tuning of solid-state microwave oscillators may be 
reproduced under these simplified conditions, and that therefore it is 
very frequently the microwave circuit, rather than the device, which 
is at fault. In addition, the practicability of broad-banding circuits for 
deviator and locking-amplifier applications has been demonstrated. 
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Radiation Losses of Tapered Dielectric 
Slab Waveguides 


By DIETRICH MARCUSE 
(Manuscript received November 10, 1969) 


In this paper we calculate radiation losses of a single mode dielectric 
slab waveguide for TE and TM modes. The theory is based on the deter- 
mination of the radiation losses of one abrupt step. We obtain the losses 
of arbitrarily deformed waveguides by regarding the arbitrary deformations 
as a succession of infinitely many infinitesimal steps. This method yields 
the same results as a very different method presented earlier. It allows us 
to calculate the losses of TM modes that were hard to obtain by the earlier 
method. 

The radiation losses of single mode slab waveguides with abrupt steps of a 
2:1 ratio are surprisingly low and can be kept below 1 percent by dimen- 
stoning the guide properly. The loss advantage of linear tapers becomes 
noticeable only when the tapers are very long. An optimized taper changes 
more rapidly in its wider portion and becomes more gradual in tts narrow 
part. 


I. INTRODUCTION 


The study of radiation losses of dielectric waveguides, which has 
been described in three earlier papers, has been extended to cover 
abrupt steps in a single mode waveguide as well as continuous tapers. 
The mathematical theory of radiation losses caused by a step in the 
waveguide 1s used to compute the losses caused by tapers by regarding 
the taper as a succession of infinitely many infinitesimal steps. This 
method can also be used to rederive the equations for a dielectric slab 
waveguide with small wall distortions presented earlier." Both the 
earlier method and the derivation based on small steps lead to identical 
results. The perturbation theory used in Ref. 1 was not very well suited 
for calculating the losses of TM modes. The step method is equally ap- 
plicable to TM and TE modes and allows us to derive for TM modes 
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the corresponding expressions which for TE modes were presented in 
Ref. 1. 

The radiation losses of steps and tapers are surprisingly small. A 
step which changes the thickness of a dominant mode slab waveguide 
to one half of its original value causes a loss of only about 1 percent for 
TI modes and about 2 percent for TM modes if operated at favorable 
frequencies. The losses of tapers are even smaller and can be made as 
small as desired for sufficiently long tapers. 

Comparison of the radiation losses of slab waveguides with round and 
rectangular waveguides (to be published) shows that the slab wave- 
guide losses are exceptionally low. The losses caused by steps in circular 
waveguides are higher by an order of magnitude. 


WI. THE MODES OF THE SLAB WAVEGUIDE 


We state briefly the TE and TM modes of the dielectric slab wave- 
suide. For simplicity we assume that all the fields are independent of 
one spatial coordinate so that we can write symbolically 


0 

5 =o (1) 
Incidentally, it is only because we limit the discussion to cases where 
equation (1) applies that it is possible to speak of transverse electric 
(TE) and transverse magnetic (TM) modes. In the general case the 
modes are hybrids and possess longitudinal I as well as H components. 
The modes of the dielectric slab waveguide consist of a finite set of 
guided modes and a continuum of radiation modes. The slab geometry 
is shown in Fig. 1. 


2.1 TH Modes 


The field components #,, EH, and H, vanish. The remaining com- 
ponents of the magnetic field can be obtained from £E, 


ik, _ 6 


He rs ar dz “et (2) 
7 OH, 
* ap 02 (3) 


The dependence of the field component on the length coordinate z and 
on the time ¢ is given by 


por rn ee: (4) 


This factor will be omitted from the following equations. 
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Fig. 1 — Dielectric slab waveguide. 


2.1.1 Hven Guided Modes 


| 


(5) 


E, = A, cos xx aI oe 
E, = Ae™ cosxde7'*' le|2d 


The coefficient A, is related to the power P carried by the mode by the 
following equation 


A, = | 2eueP_|* 6) 
joa 
The relation between x, y and 1s given by 
Kk = [(nk)” — Bol’, (7a) 
y = (65 — bk}, (7b) 
k = wleouo)’. (8) 


nm is the index of refraction of the dielectric slab. The index of the sur- 
rounding medium is taken to be n = 1. The eigenvalue equation for the 
determination of 8» 1s 


tankd = *. (9) 


K 


A few numerical values for 8) are shown in Table I. The TE modes are 
power orthogonal. With the power flow P in z-direction (per unit length 
of y) we have 


Ps, = B= | E,,E*, dav. (10) 
WEL Jo 
2.1.2 Hven Radiation Modes 


EH, = B, cos ox Pe Sd). 
= Cen ae Ce | x | > 


(11) 
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TABLE I—SomE NUMERICAL VALUES OF By 


kd n TE Mode Bod TM Mode fod 
2.5 2.50271 2.50263 
5.0 5.01550 5.01519 
10.0 1.01 10 .06061 10 .06016 
20.0 20.16711 20. 16680 
0.25 0.25781 0.25207 
0.5 0.54916 0.51677 
1.0 1.432 1.21972 1.12809 
1.5 1.93825 1.84210 
2.0 2.66839 2.58934 
3.0 4.13075 4.08131 


Propagation constants of TE and TM modes 


(The asterisk indicates the complex conjugate value) with 


o = [(nk)’ — 6°}, (12) 
p= (kh — B’}', (13) 
C, = 3B, exp (—ipd( cos od + isin rd) (14) 


B= ee 
; mB(p cos’ od + o° sin” cd) 
The power orthogonality of the radiation modes can be expressed by 
the equation 


(15) 


Pea p< I E, (a, p)E*(x, p’) de. (16) 


P is the power flowing per unit length (in y-direction) in the z-direction. 

The odd TE modes have been listed in Ref. 1 (together with the even 
TE modes). Since we are limiting the discussion of TE modes to sym- 
metrical tapers excited by an even mode we will not need the odd TE 
modes in this paper. 


2.2 TM Modes 


With the restriction imposed by equation (1) the only nonvanishing 
components of the TM modes are H, , 





1 OH, 
fa we OZ ’ (17) 
Eo = 2 OH, (18) 


we Ox 
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We have no occasion to use the odd guided TM modes, therefore only 
the even guided modes will be listed. 


2.2.1 Hven Guided Modes 
H, = A, cos xx for |x|s ‘t (19) 
H, = A.e”’ cosxde™'*' for |x| 2d 
The amplitude constant is related to the power P carried by the mode 
4 
ee meen ae by (20) 
Bo 
| Btn ' * 
The constants x and 8 are related to 8) by equations (7a) and (7b). 


The eigenvalue 8, of the even guided TM modes is obtained as a solution 
of the eigenvalue equation 


tankd = n° (21) 
A few numerical values for 8) are shown in Table I. The power orthogo- 
nality of the guided TM modes can be expressed by 


Pbim = Bo f e H,,, *, dt = 2 | ely ,l*, AX. (22) 
GW dg € nO 
2.2.2 Hven Radiation Modes 
H, = B, cos ox lz{s 7 (23) 
H, = Ce"?! + C¥e**'*! oe eee 
with p and y given by equations (12) and (13) and with 
C, = 1B,(cos od + 7 sin od)o“* (24) 
The amplitude B, is given by 
) 2 
i a __ seen (25) 
|ns(n? cos’ od + = sin” od)| 
2.2.3 Odd Radiation Modes 
H, = By sin ox for |x| sd 
(26) 
H, = te iO? ae Cree for | a | > d 


| 2 | 
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with 
1 —ipdf : 2 oT 
Co = 5Boe (sin od — 2, © od) (27) 
p 


and 


By = pp, eh (28) 
| (n*o* p sin’ od sh a cos” rd)| 


The power orthogonality of the radiation modes is expressed as 


Pies ae © H,.(«, p)H%(x, p") de. (29) 


All the modes are orthogonal among each other. The amount of power 
P carried by each mode is normalized to the same value. The actual 
power carried by the field is determined by the expansion coefficients. 


III, TE MODE RADIATION LOSS 


Prior to discussing the radiation losses of a waveguide taper we 
calculate the losses of an abrupt step in the dielectric slab waveguide. 
We limit our investigation to the case that only the lowest order guided 
mode of each type exists. These modes do not experience a cut-off and 
can exist on waveguides with vanishingly small thickness. The steps 
are considered to be sufficiently small to keep the guide dimensions below 
the point where a second guided TE or TM mode becomes possible. 

The geometry of the step is shown in Fig. 2. The loss problem is 
solved by assuming that one guided (TE or TM) mode is incident on 
the step. The discontinuity in the waveguide causes a reflected mode 
as well as forward and backward traveling radiation modes to occur. 
The unknown amplitudes of these modes are determined by requiring 


REGION 1 x REGION 2 
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U LLL 


Fig. 2— Abrupt step in a dielectric slab waveguide. 


DIELECTRIC TAPER LOSS 279 


that the transverse field components are continuous at the step. For 
TE modes we get the following equations: 


E;? + 4,B,” + | q-(p) Ey” (p) dp 
; (30) 


= ¢,y” +/ q:(p) LE; (p) dp, 


HO? + aHO + | a()HO(o) do 
(31) 


ao ras Ae +/ q:(p)Hs"’(p) dp. 


The superscripts 2, r and ¢ indicate incident, reflected and transmitted 
waves. The field components whose p dependence is explicitly shown are 
radiation modes, the other field components belong to guided modes. 

There are two ways to compute the radiation losses. We can calculate 
the coefficients c, and a, of the transmitted and reflected guided mode 
and calculate the radiated power loss from 

AP 

pe eel 
or we can calculate the coefficients gq, and g, and obtain the radiation 
losses from 


(32) 





AP _ " 2B | ; 2 B 
=f tat taet fla Eas. (33) 


Both methods should, of course, lead to the same result. 
It is impossible to obtain exact solutions of equations (80) and (31); 
a comparison of both methods (82) and (88) allows an estimate of the 
validity of the approximations that are used to solve these equations. 
We obtain approximate solutions by the following argument. Since 
all modes of the same waveguide section are orthogonal we can use the 
orthogonality of the modes to isolate c, on the right hand side of equa- 
tions (80) and (81). We get for TE modes from (80) 
C, = 


(1 + a,) [ Boe ae (34a) 
= 55 


and from equation (31) 


ee [ (i) pus 
G; ep (1 — a,) ; Ey’ EH, dx. | (34b) 


280 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


The coefficient g, was neglected. For large steps the radiation is 
scattered predominantly in forward direction so that q, is indeed small. 
If the step height is small the fields Z$” and E‘” (p) become more nearly 
orthogonal so that g, again does not contribute very much to equations 
(34a) and (34b). The propagation constant 6. belongs to the guided 
mode on the waveguide to the right of the step while 6, belongs to the 
guided mode to the left of the step. Because of the different waveguide 
size these propagation constants are not the same. 
Equations (34a) and (84b) allow the determination of c, and a, 


— 28:82 1p paps 
= By = Bo 
ee B, | Be a) 


The integral can be evaluated with the help of equation (5) so that we 
obtain 


A(n? — 1)8,B.k" cos kz do 


} 
| (6, dy i Bs) (¢, dz + bs) | (8B, or B2)°(B, = Bo) (Kj = 2) 
‘[y2 COS K, dz — x, Sin kK, de + (y¥1 — Ye2) COS K, de 77"), (37) 


Cc; = 


The determination of q, and gq; is not quite as simple. The functions 
ES (p) and ES (p) belong to different waveguides and are not orthog- 
onal. For large steps with predominantly forward scattering q, may 
again be negligible but this is certainly not true for small steps. We 
would need different approximations for large and small steps. To avoid 
this difficulty we consider only small steps and construct large steps 
and waveguide tapers as a succession of small steps. For infinitesimal 
steps the modes ES” and ES” are very nearly orthonormal and reflected 
guided modes can be neglected. Using the orthogonality of the modes 
we obtain 


qi(p) = (Bo + BL (38) 
and 
Qr(p) = 2(Bo — BL (39) 
with 
1 7 i t)* 
i= api E‘ ES (p) dx. (40) 


The expression J does not depend on the sign of 6, we therefore obtain 
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q-(p) from q,(p) by reversing the sign of the propagation constant 8 
of the radiation mode. We may drop the subscript r and ¢ and obtain 
after integration 


q(p) = —(@ — 1)k’ 
p cos xd cos od Ad . (41) 


(7)*(Bo — a) | B | (5, d + a cos’ od + o” sin’ od) | 
0 
The difference Ad = d. — ad; is assumed to be small. Because of the 
relation between q,(p) and q,(p) we can write equation (83) more simply 


* = J [ale) f 18.1 dp. (42) 


IV. APPLICATION TO TAPERS 


Equation (41) can immediately be extended to apply to symmetrical 
waveguide distortions of arbitrary shape. We assume that the shape of 
the waveguide wall is described by the function f(z) as shown in Fig. 3. 
We can then write 


ni Fat. (43) 
The amplitude q(p) was calculated for a small step at 2 = 0. Locating 
the step at z the guided wave arrives there with the phase e**** instead 
of with phase zero as assumed in equation (41). The radiation mode was 
also referred to z = 0. Referring it to a step at z adds the phase factor 
e’*? to equation (41) because the amplitude B of the radiation mode 
enters equation (41) with its complex conjugate value. A step at z 
would be described by an expression like (41) with an additional phase 
factor 
es, (44) 








Fig. 3— A symmetrical wall distortion (symmetrical taper) of a dielectric slab 
waveguide. 
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It must be assumed that 6, (but not 8) 1s a function of z if the guide 
thickness is changing. 

The total radiation loss of a section of waveguide (for example a 
taper) of length L is given by equation (42) with 


qe) = —@ — Ik 


2 p cos xd cos ode™*“P?~?) dj 
dz 
— ao ina ee ee 
| et) E |B | (A dr Ba) cos’ od + o° sin” od) | 
(45) 


Except for the restriction to symmetrical waveguides, equation (45) 
describes the same problem as treated in Ref. 1. In fact, we can obtain 
equation (57) of Ref. 1 by a partial integration. The formulation of 
Ref. 1 applies to the case that the thickness of the waveguide at z = 0 
and z = LI is very nearly the same. The function f(z) deviates so little 
from the half thickness d of the perfect waveguide that 6) , x and y can 
be assumed to be independent of d. With these assumptions, we obtain 
as a result of a partial integration 


(n? — 1)k’p cos xd cos ode(B) 


q(p) = ee (46) 
i| = | 8 | (6, d + Ba) cos’ od + o’ sin’ od) | 
with 
o(8) = | f@e*** ae, (47) 


The agreement with equation (57), Ref. 1, is perfect if we keep in mind 
that the functions describing the upper and lower side of the waveguide 
are now identical except for a minus sign and that the function f(z) — d 
of Ref. 1 is now redefined and replaced by f(z). 

The fact that equation (45) is identical to the theory of Ref. 1 proves 
the validity of our method of continuous steps. 


4.1 TM Mode Radiation Loss 


The radiation losses of the lowest order guided TM mode at a sym- 
metrical step in the dielectric slab waveguide can be calculated from 
equations (30) and (81) by changing the subscript x to y and y to z. 

The c, coefficient for the lowest order (dominant) even TM mode is 


(48) 
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and the a, coefficient is 








oes es 49) 
with 
| _@ — UB, COS Ko ds a Yada 
(8; — Bi)(e + 2) ; a.( = ae oe, a)(5s ak ty R| 
-{x,k? sin x, do — Yolk; + 83) cos x, de 
+ [ye(n?k? + 62 — Bi) — ny ker" cos x dy}, (50) 
and 
i= eI ae ag 
2 — BiG 2 aa : 
ahar 5 path \e pay es ) 
-{x(k° + Bi — B) sink, dz — n'y. k” cos x; de 
+ nye — nik + (62 — BJe 7” cos my di} - (51) 


The corresponding expression for the TE modes, equation (37) is 
apparently considerably simpler. 

The expression for the radiation loss of TM modes on a dielectric 
waveguide of arbitraty shape is obtained from 


AP =f tata f+ ate) (4) 2 ap (52) 
with the coefficient of the even radiation modes 
qe(p) = 
Ln — L) py'(B.8 cos od + yo sin cd) cos caer ( A — oh) : 
” 2(B, _ a) | g | (tk +. vi} (n?? cos’ od + sin’ od) 
(53) 
and the coefficient for the odd radiation modes 
qo(p) = 
. (nv — 1)py'?(BB sin od — yo cos od) cos xde*~8 ne 4 2h) ‘ 


0 2(B) — A)\ nb |B (a n’k? = AEs + vi)(n*ot sin’ od +55 cos’ ad) 
(54) 
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The restriction to symmetrical waveguides was dropped so that equa- 
tions (52) through (54) hold for waveguides of arbitrary shapes as 
shown in Fig. 4. A comparison of equations (53) and (45) shows im- 
mediately how equation (45) could be generalized to an arbitrary wave- 
guide shape. Corresponding expressions for the odd TE radiation modes 
could immediately be constructed by a comparison of equation (61), 
Ref. 1, with equation (54). The function h(z) describes the shape of the 
dielectric slab waveguide at the lower air-dielectric interface. The theory 
of dielectric slab waveguides with rough wall, as presented in Ref. 1, 
was limited to TE modes. The same procedure which lead from equations 
(45) to (46) allows us to derive the TM-mode radiation loss equations for 
waveguides with rough walls. 


(n® — 1)py'(608 cos od + yo sin cd) cos «dl o(8) — _¥(8) 
nk? 


Dit | 6 | (ao + ye + yd) (n'p? p cos’ od + sin’ od) } 
(55) 


qe(p) = 


} 


and 
_ MW? — Ipy'(B.8 sin od — yo cos od) cos adlo(8) + ¥(6)] 


Qo(p) = n2 ke? 
Dit |B | [jo ey + vd) (np? p sin’ od coe ne cos’ sd) 
(56) 


The Fourier component ¢(@) is given by equation (47). The correspond- 
ing Fourier component y(@) follows from equation (47) by replacing 
f(z) with h(z). 


V. NUMERICAL RESULTS 


The radiation losses caused by a symmetrical step with the ratio 
d./d, = 0.5 for n = 1.01 are shown in I'ig. 5. The solid curves are 
obtained from equation (42) with the help of equations (45) and (53) 


-—t (Z) 


¥ 
Suu VILL LLL LLLLLLLLE. 


GY ff LLiee YET 
Z=L 










A 
h(z)—~ 


Fig. 4— An asymmetrical wall distortion of the slab waveguide. 
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Fig. 5— TE and TM mode losses caused by a step in the slab waveguide. Solid 
line calculated from (42), (52) dotted line calculated from (32). = 1.01, d2/d;: 
= 0.5. 


by approximating the step with a steep linear taper of length L/d, = 1 
For very short tapers, the radiation loss is independent of the length of 
the taper. The dotted curves were obtained from equations (82) and 
(37) for TE modes and equations (48) through (51) for TM modes. The 
agreement between the results obtained by the two different methods is 
quite good. It is also apparent that TE modes and TM modes suffer very 
nearly the same losses in this case. It is surprising how low the radiation 
losses are in the region of kd, = 11. Both modes pass this considerable 
step with a power loss of less than 1 percent. For kd, > 20 the larger 
portion of the waveguide can support more than one guided mode. This 
is the reason why the loss curves were not extended past this point. Both 
the TE as well as TM modes show minimum loss values for particular 
values of kd, suggesting the possibility of optimizing waveguide steps. 
Fig. 6 shows the radiation losses of the even, lowest order TE and TM 
mode for a step on a single mode waveguide with n = 1.482. The TE 
and TM mode losses are quite different for this waveguide with high 
dielectric constant. The fact that for TE as well as TM modes there is 
an increasing discrepancy between the two methods of calculation for 
increasing values of kd, with the dotted curve for the TM modes even 
becoming negative, may indicate that the solid curves are more reliable. 
For small values of kd, the agreement between the two methods becomes 
quite good. The losses of the TM mode are generally higher than the 
TE mode loss. However, even in this case the TE mode loss can be 
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Fig. 6— Same as Fig. 5. n = 1.482. 


made approximately 1.percent while the TM mode loss can be as low 
as 2 percent if the step is used at its optimum point of operation. For 
kd, > 3 the larger waveguide section ceases to be single mode. 

The dependence of the TE-mode radiation losses on the ratio of the 
width d,/d, of the guide on either side of the step is shown in Fig. 7. 
This curve was computed from equations (82), (86) and (87). The 
dielectric constant of the waveguide material was chosen as n = 1.01 


\ 
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Fig. 7 —Step loss of TE mode as a function of the ratio do/di.n = 101, kd: 
= 100. 
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and kd, = 10 was used. It is apparent that the radiation losses increase 
rapidly as d./d, — 0. 

So far we have discussed the radiation losses of abrupt steps. The 
reduction of the TM-mode losses as the step is changed into a taper 1s 
seen in Fig. 8. This figure was calculated from equations (52) and (53) 
for a ratio of d./d; = 0.5 of the straight guide sections that are con- 
nected by a symmetrical linear taper. It is apparent that the linear 
taper needs to be quite long before a substantial improvement of the 
radiation loss is obtained. The actual length of an effective taper need 
not be very large. The length of the taper is represented in Fig. 8 as the 
ratio of its actual length to the half width d, of the thicker waveguide 
section. Extrapolating the result of Fig. 8 to a value of L/d, = 100 
appears to lead to a loss reduction to approximately 1/10 of the loss 
of the abrupt step. With \ = ly we find that kd, = 1 corresponds to 
d, = 0.16 » so that L/d, = 100 corresponds to L = 16 un. 

It appears that there are more effective shapes than linear tapers. 
Equations (45), (53) and (54) show that the loss of a taper is essentially 
determined by two factors, the magnitude of the derivatives df/dz and 
dh/dz and the value of 8) — 6. Rapid oscillations of the function 
exp [7(8) — 6)z] cause the value of the integral to be small. The largest 
value of Bis 8 = k. The worst value appearing in the argument of the 
exponential function is, therefore, 8, — k. The propagation constant of 
the guided mode depends on the width of the waveguide and is therefore 
a function of z. The optimum taper, that is intended to connect two 








Fig. 8—'TM mode ee oe as a function of the length L of the taper. 
n = 1432, kdh = 10, o/c = 
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different waveguides in a given length, would attempt to use larger 
values of df/dz and dh/dz on the wide part of the taper where 8) — k 
is still larger and provide smaller values of these derivatives on its 
narrow part where 8, — / is smaller. A linear taper radiates more on its 
narrower portion where the field is less tightly guided. An optimum 
taper would attempt to distribute the radiation loss uniformly over the 
length of the taper. 


VI. RANDOM WALL DISTORTION 


In Ref. 1 we computed the losses of the lowest order guided TE 
mode that is caused by random distortions of one of the two waveguide 
walls. For the sake of completeness we include here the corresponding 
formula for TM modes which can be immediately obtained from the 
theory presented in Ref. 1 and our present equations (55) and (56). The 
ensemble average of the relative power loss of the lowest order even TM 
mode (caused by the distortion of one wall by a random process whose 
correlation function is a simple exponential function, equation (85) of 
Ref. 1) with r.m.s. deviation A and correlation length B is given by 


(ar) A*yL(n* — 1)° p COS’ ky d 
In BB. a | |e nk? | 
“lane + 73 arg Banya 
| Gob cos od + yo sin od)’ 4 Bob sin od — yo cos ody dp 
\n?p? cos’ od + m3 sin’ a np’ sin’ ad Tae a cos’ ad| 
(57) 


The radiation loss that is obtained from this equation is shown in Figs. 
9 and 10, by the solid lines. The dotted curves are reproduced from 
Ref. 1 and give the loss of the TE mode for comparison. The curves 
labeled AP~/AP”™ show the ratio of backward to forward scattered 
power. The conclusion to be drawn from these curves is that the TM 
mode losses caused by small random wall perturbation are very nearly 
the same as for TE modes. Neither type of mode seems to offer a distinct 
advantage. 

The radiation losses of slab waveguides with random wall distortions 
are representative of the losses of round waveguides with similar wall 
distortions. However, the radiation losses of slab waveguide tapers are 
considerably lower than those of round waveguides. (A discussion of 
round waveguides will be published.) 
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Fig. 9— Comparison of TM loss (solid line) and TE loss (dotted line) caused 
by a random distortion of one waveguide wall. B = correlation length, A = 
rms. wall distortion, d = half width of slab, L = length of distorted guide 
section. n == 1.5, kd = 13. 
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Fig. 10 — Same as Fig. 9.n = 1.01, kd = 80. 
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VII. CONCLUSION 


We have derived radiation loss formulae for the dominant mode 
dielectric slab waveguide. The losses for steps and tapers in the wave- 
guide were calculated for TE as well as TM modes. The theory of radia- 
tion losses for random wall imperfections, that was developed earlier for 
TE modes, was extended to TM modes. 

The radiation losses of abrupt steps with a 2:1 ratio were found to 
be surprisingly low (a few percent). The advantage of gradual linear 
tapers over abrupt steps becomes appreciable only if the taper is much 
longer than the width of the slab. 

The losses of steps and tapers of the slab waveguide are exceptionally 
low. Dielectric waveguides with round and rectangular cross sections 
have considerably highest losses. However, the method of describing 
waveguide distortions as successions of abrupt steps is applicable to 
all dielectric waveguides and simplifies their treatment considerably. 
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An Efficient Heuristic Procedure 
for Partitioning Graphs 


By B. W. KERNIGHAN and 8S. LIN 
(Manuscript received September 30, 1969) 


We consider the problem of partitioning the nodes of a graph with costs 
on its edges into subsets of given sizes so as to minimize the sum of the costs 
on all edges cut. This problem arises in several physical sitwations—for 
example, in assigning the components of electronic circuits to circuit 
boards to minimize the number of connections between boards. 

This paper presents a heuristic method for partitioning arbitrary graphs 
which rs both effective in finding optimal partitions, and fast enough to be 
practical in solving large problems. 


I. INTRODUCTION 


1.1 Definition of the Problem 


This paper deals with the following combinatorial problem: given a 
graph G with costs on its edges, partition the nodes of G into subsets no 
larger than a given maximum size, so as to minimize the total cost of 
the edges cut. 

One important practical example of this problem is placing the com- 
ponents of an electronic circuit onto printed circuit cards or substrates, 
so as to minimize the number of connections between cards. The com- 
ponents are the nodes of the graph, and the circuit connections are the 
edges. There is some maximum number of components which may be 
placed on any card. Since connections between cards have high cost 
compared to connections within a board, the object is to minimize the 
number of interconnections between cards. 

This partitioning problem also arises naturally in an attempt to 
improve the paging properties of programs for use in computers with 
paged memory organization. A program (at least statically) can be 
thought of as a set of connected entities. The entities might be sub- 
routines, or procedure blocks, or single instruction and data items, 
depending on viewpoint and the level of detail required. The connections 
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between the entities might represent possible flow or transfer of control, 
or references from one entity to another. The problem is to assign the 
objects to “pages” of a given size so as to minimize the number of 
references between objects which lie on different pages. 

To pose the partitioning problem mathematically, we shall need the 
following definitions. Let G be a graph of n nodes, of sizes (weights) 
w,>0,7=1,---,7. Let p bea positive number, such that 0 < w; = p 
for allz. Let C = (¢,;), 7,7 = 1, --+ ,n bea weighted connectivity matrix 
describing the edges of G. 

Let k be a positive integer. A k-way partition of G is a set of nonempty, 
pairwise disjoint subsets of G, v,, --- , v, such that U*%_, o; = G. A 
partition is admisszble if 


lv.| Sp forall 2, 


where the symbol | x | stands for the size of a set x, and equals the sum 
of the sizes of all the elements of x. The cost of a partition is the summa- 
tion of c;; over all ¢ and j such that 7 and 7 are in different subsets. The 
cost is thus the sum of all external costs in the partition. 

The partitioning problem we consider here is to find a minimal-cost 
admissible partition of G. 

There are three other problems which are equivalent to this one. 
First, minimizing external cost is equivalent to maximizing internal 
cost because the total cost of all edges is constant. Further, by changing 
the signs of all c,;’s, we can maximize external cost, or minimize internal 
cost. 


1.2 Hxact Solutions 


A strictly exhaustive procedure for finding the minimal cost partition 
is often out of the question. To see this suppose that G has nm nodes of 
size 1 to be partitioned into k subsets of size p, where kp = n. Then there 
are (") ways of choosing the first subset, (";”) ways for the second, 
and so on. Since the ordering of the subsets is immaterial, the number 


pees 1 (n=)... (2)(r), 


For most values of n, k, and p, this expression yields a very large num- 
ber; for example, for n = 40 and p = 10 (k = 4), it is greater than 10”. 
Formally the problem could also be solved as an integer linear pro- 
gramming problem, with a large number of constraint equations neces- 
sary to express the uniformity of the partition. 
Because it seems likely that any direct approach to finding an optimal 
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solution will require an inordinate amount of computation, we turn to 
an examination of heuristics. Heuristic methods can produce good 
solutions (possibly even an optimal solution) quickly. Often in practical 
applications, several good solutions are of more value than one optimal 
one. 

The first and foremost consideration in developing heuristics for 
combinatorial problems of this type 1s finding a procedure that is power- 
ful and yet sufficiently fast to be practical. A process whose running 
time grows exponentially or factorially with the number of vertices of 
the graph is not likely to be practical. In most cases, a growth rate of 
more than the square of the number of vertices is still not too practical. 
(If the running time of a procedure grows as f(n), where 7 is the number 
of vertices involved, we shall refer to it as an f(n)-procedure.) 


1.3 False Starts 


To point out a few pitfalls, we mention some unsuccessful attempts at 
heuristic solutions to the partitioning problem. 


1.3.1 Random Solutions 


One tactic is simply to generate random solutions, keeping the best 
seen to date, and terminating after some predetermined time or value 
is reached. This is quite fast, although actually an n’-procedure. Un- 
fortunately, this approach is unsatisfactory for problems of even 
moderate size, since there are generally few optimal or near-optimal 
solutions, which thus appear randomly with very low probabilities. 
Experience with 2-way partitions for a class of 0-1 matrices of size 
32 X 32, for example, has indicated that there are typically 3 to 5 
optimal partitions, out of a total of 4 (34) partitions, giving a probability 
of success on any trial of less than 107’. 


1.3.2 Max Flow-Min Cui 


Another partitioning method is the Ford and Fulkerson max flow-min 
cut algorithm’. The graph is treated as a network in which edge costs 
correspond to maximum flow capacities between pairs of nodes. A cut 
is a separation of the nodes into two disjoint subsets. The max flow-min 
cut theorem states that the maximal flow values between any pair of 
nodes is equal to the minimal cut capacity of all cuts which separate 
the two nodes. In our terminology, a cut is a 2-way partition, and the 
cut capacity is the cost of the partition. The Ford and Fulkerson algo- 
rithm finds a cut with maximal flow, which is thus a minimal cost cut; 
this represents a minimum cost partition of the graph into two subsets 
of unspecified sizes. 
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There are several difficulties involved in using the Ford and Fulkerson 
algorithm for our partitioning problem. The most severe of these is the 
fact that the algorithm has no provision for constraining the sizes of the 
resultant subsets, and there seems to be no obvious way to extend it 
to include this. Thus if flow methods are used to perform a split, then 
further processing is necessary to make the resulting subsets the correct 
size. If the subsets are greatly different in size, then use of this algorithm 
will have produced essentially no benefit. Hence in spite of its theoretical 
elegance, the Ford and Fulkerson algorithm is not suitable for this 
application. (Note however, that since it does find the minimal cost 
unconstrained 2-way partition, the value it produces is a lower bound 
for solutions produced by any method.) 


1.3.3 Clustering 


A class of much more intuitive methods is based on identifying “‘nat- 
ural clusters’ in the given cost matrix—that is, groups of nodes which 
are strongly connected in some sense. For example, one can use very 
simple heuristics for building up clusters, based on collecting together 
elements corresponding to large values in the cost matrix. But again 
these methods do not in general include much provision for satisfying 
constraints on the sizes of the subsets, nor do they provide for syste- 
matic assignment of ‘stragglers’? (nodes which do not obviously belong 
to any particular subset). 


1.3.4 \-Opting 


Lin, working on the Traveling Salesman Problem, [See Ref. 2] cate- 
gorized a set of methods of improving given solutions by rearranging 
single links, double links, triplets, and in general, links. He referred 
to a change involving the movement of X links as a \-change. If a con- 
figuration of the system is reached in which no )-change can be made 
which results in a decrease in cost, the configuration is said to be ‘‘A-opt.”’ 

For the partitioning problem, an analogous operation is the inter- 
change of groups of \ points between a pair of sets. Thus a 1-change is 
the exchange of a single point in one set with a single point in another 
set. A configuration is then said to be “‘l-opt” if there exists no inter- 
change of two points which decreases the cost of the partition. Experi- 
ments to evaluate 1-opting for 2-way partitions of 0-1 matrices (82 X 32) 
within which about one-half of the elements were nonzero, show that 
apparently optimal values can be achieved in about 10 percent of the 
trials; values within 1 or 2 of the optimal can be achieved in about 75 
percent of cases. 
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It appears fruitless to extend \ beyond 1 (l-opting is already an 
n*-procedure), or to extend 1l-opting experiments to partitions into 
more than two subsets, since more powerful methods have been devel- 
oped. These methods are the topic of the next sections. 


Il. TWO-WAY UNIFORM PARTITIONS 


2.1 Introduction 


The simplest partitioning problem which still contains all the sig- 
nificant features of larger problems is that of finding a minimal-cost 
partition of a given graph of 27 vertices (of equal size) into two subsets 
of n vertices each. The solution of the 2-way partitioning problem is the 
subject of this section. The solution provides the basis for solving more 
general partitioning problems. In Section 2.6, we discuss 2-way partitions 
into sets of unequal size. 

Let S be a set of 2n points, with an associated cost matrix C = (c,;), 
1,7 = 1, --- , 2m. We assume without loss of generality that C is a 
symmetric matrix, and that c;; = 0 for all 2. There is no assumption 
about nonnegativity of the c;;’s. We wish to partition S into two sets 
A and B, each with n points, such that the ‘‘external cost”? 7’ = a Pergo 
is minimized. 

In essence, the method is this: starting with any arbitrary partition 
A, B of S, try to decrease the initial external cost T' by a series of inter- 
changes of subsets of A and Bb; the subsets are chosen by an algorithm 
to be described. When no further improvement is possible, the resulting 
partition A’, B’ is locally minimum with respect to the algorithm. We 
shall indicate that the resulting partition has a fairly high probability 
of being a globally minimum partition. 

This process can then be repeated with the generation of another 
arbitrary starting partition A, B, and so on, to obtain as many locally 
minimum partitions as we desire. 

Given S and (c,;), suppose A*, B* is a minimum cost 2-way partition. 
Let A, B be any arbitrary 2-way partition. Then clearly there are subsets 
XCA,YC Bwith|X{|=|Y| Ss n/2 such that interchanging X and 
Y produces A* and B* as shown below. 


© (® 
A B Ae B* 
* 


A=A-X4+Y 
B*=B-Y +X 
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The problem is to identify X and Y from A and B, without considering 
all possible choices. The process we describe finds X and Y approxi- 
mately, by sequentially identifying their elements. 

Let us define for each ae A, an external cost L, by 


Eo: > ex 


yeB 


and an internal cost I, by 


I,= ae 


zeA 


Similarly, define /, , J, for each be B. Let D, = LH, — I, for allze 8; D, 
is the difference between external and internal costs. 


Lemma i: Consider anyae A, be B. If a and b are interchanged, the 
gain (that is, the reduction in cost) 1s precisely D, + Dy — 2Car . 


Proof: Let z be the total cost due to all connections between A and 
B that do not involve a or 6. Then 


T=2+Hh, + By —- ca. 
Exchange a and 0; let T’ be the new cost. We obtain 
TO = 2 Tee tn 
and so 
gain = old cost — new cost = T — T”’ 
= D,+ Ds, — 2c, . 


2.2 Phase 1 Optimization Algorithm 


In this subsection we present the algorithm for 2-way partitioning. 
First, compute the D values for all elements of S. Second, choose 
a; 2A, 6; ¢ B such that 


gi = Da, + D,, an 2Ca;b; 


is maximum; a; and 6; correspond to the largest possible gain from a 
single interchange. (We will return shortly to a discussion of how to 
select a; and 6; quickly.) Set a; and b; aside temporarily, and call them 
a‘ and bf , respectively. 

Third, recalculate the D values for the elements of A — {a,;} and 
for B — {b;}, by 


D! — D, + rd Se — 2Czp; reA make {a;}, 
D{ = D, + 7d ee = 2 es y2B = {b;}. 
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The correctness of these expressions is easily verified: the edge (2, a;) 
is counted as internal in D, , and it is to be external in D/ , so c,,, must 
be added twice to make this correct. Similarly, c,,, must be subtracted 
twice to convert (2, 6;) from external to internal. 

Now repeat the second step, choosing a pair af, bf from A — $a‘} 
and B — {b/} such that g. = D,, + Di,» — 2ca,+s,- iS maximum (a/ 
and b/ are not considered in this choice). Thus g, is the additional gain 
when the points aj and bi are exchanged as well as a/ and Of ; this ad- 
ditional gain is maximum, given the previous choices. Set af and b/ 
aside also. 

Continue until all nodes have been exhausted, identifying (a{ , bf),---, 
(a/ , 6’), and the corresponding maximum gains g3;, -°° , g,. AS each 
(a’, b’) pair is identified, it is removed from contention for further choices 
so the size of the sets being considered decreases by 1 each time an 
(a’, b’) is selected. 

If X = af,aj3,°--,a,, Y = bf, 65, --- , bf , then the decrease in 
cost when the sets X and Y are interchanged is precisely g, + g. + -:- 
+- g,. Of course >|” g; = 0. Note that some of the g,’s are negative, 
unless all are zero. 

Choose k to maximize the partial sum >>*_, g; = G. Now if G > 0, 
a reduction in cost of value G can be made by interchanging X and Y. 
After this is done, the resulting partition is treated as the initial partition, 
and the procedure is repeated from the first step. 

If G = 0, we have arrived at a locally optimum partition, which we 
shall call a phase 1 optimal partition. We now have the choice of re- 
peating with another starting partition, or of trying to improve the phase 
1 optimal partition. We shall discuss the latter option shortly. Figure 1 
is a flowchart for the phase 1 optimization procedure. 


2.3 li ffectiveness of the Procedure 


One general approach to solving problems such as this one is to find 
the best exchange involving say \ pairs of points, for some X specified 
in advance’. The difficulty encountered is that use of a small value of 
\ is not sufficient to identify good exchanges, but the computational 
effort required grows rapidly as X increases. 

The procedure we have described sequentially finds an approximation 
to the best exchange of \ pairs. \ is not specified in advance, but rather 
is chosen to make the improvement as large as possible. This technique 
sacrifices a certain amount of power for a considerable gain in speed. 

Since we construct a sequence of gains g; ,7 = 1, --- , n, and find the 
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COMPUTE D VALUES FOR A AND FOR B 
p+ 
Ap +A, Bp=+-B 










SELECT a ¢€ Ap, bj € Bp SUCH THAT 
Dp = Da, +Dbj-2Ca,b, 1S MAXIMUM 


p=+pt 1 
UPDATE D VALUES 
FOR Ap, Bp 


CHOOSE K TO MAXIMIZE 


k 
G=)) 
i=] 





MOVE aj°-:aKTo B 
AND bj -*--bk TO A 


Fig. 1 — Flowchart of phase 1 optimization procedure. 


maximum partial sum, the process does not terminate immediately when 
some g; 1s negative. This means that the process can sequentially identify 
sets for which the exchange of only a few elements would actually in- 
crease the cost, while the interchange of the entire sets produces a net 
gain, 

Numerous experiments have been performed to evaluate the pro- 
cedure on different types of cost matrices. The matrices used have in- 
cluded (z) 0-1 matrices, with density of nonzero elements ranging from 
5 percent to 50 percent, (22) integer matrices with elements uniformly 
distributed on [0, £], k = 2, --- , 10, (zzz) matrices with clusters of 
known sizes and binding strength. Results on all of these matrices have 
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been similar, so we shall only summarize them here. A more extended 
discussion may be found in Ref. 3. 

A useful measure of the power of a heuristic procedure is the proba- 
bility that it finds an optimal solution in a single trial. Suppose that p 
is the probability that a phase 1 optimal solution found using a random 
starting partition is globally optimal. We have examined the behavior 
of this probability as the size of the matrices involved is varied. Experi- 
ments show 7 is around 0.5 for matrices of size 30 X 30, 0.2 to 0.3 for 
60 X 60, and 0.05 to 0.1 for 120 X 120. The functional behavior of p 
is approximately p(n) = 27°", 

These values are derived primarily from 0-1 matrices having about 
50 percent 1’s (randomly placed). Experiments on matrices with lower 
densities of 1’s yield larger variances, but substantially identical mean 
values for p. 


2.4 Running Time of the Procedure 


Let us define a pass to be the operations involved in making one cycle 
of identification of (af, b/), --- , (aZ, b/), and selection of sets X and 
Y to be exchanged. The total time for a pass can be estimated this way. 
First, the computation of the D values initially is an n*-procedure, since 
for each element of S, all the other elements of S must be considered. 
The time required for updating the D values is proportional to the 
number of values to be updated, so the total updating time in one pass 
gTOWS as 


(n—-At+(m—2)+--+-4+241 


which is proportional to n’. 

The dominant time factor is the selection of the next pair a‘, b{ to 
be exchanged. The method we have used to perform this searching is to 
sort the D values so that 


Da, 2 Da 


IV 
IV 


D. 


2 n 


and 


D,, 2 Ds, 2 


IV 


D, 


When sorting is used, only a few likely contenders for a maximum gain 
need be considered. This is because when scanning down the set of D,’s 
and D,’s, if a pair D,, , D,; 1s found whose sum does not exceed the 
maximum improvement seen so far in this pass, then there cannot be 
another pair a, , b; with k = 1,1 = j, with a greater gain, (assuming 
c;; 2 0) and so the scanning can be terminated. Thus the next pair 
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for interchange is found rapidly. Sorting is an n log n operation, so in 
this method, the total time required to sort D values in a pass will be 
approximately 

nlogn + (n — 1) log (n — 1) +--+ + 2 log 2 
which grows as n’ log n. 

To reduce the time for selection of an (a, b) pair, it is possible to use 
techniques which are faster than sorting, but which do not necessarily 
always give the maximum gain at each stage. For example, one method 
is to scan for the largest D, and the largest D, , and use the correspond- 
ing a and b as the next interchange. This method is essentially linear- 
time and would probably be implemented as part of the recomputation 
of the D values. It is best suited for sparse matrices, where the prob- 
ability that c,, > 0 is small. A slight extension, involving negligible 
extra cost, is to save the largest two or three D,’s and D,’s, so that if 
the largest pair does not give the maximum gain (because c,, is too 
large), then another can be tried. Experience indicates that three values 
are sufficient in virtually all cases, even for matrices with a relatively 
high percentage of nonzero entries. Use of this method reduces running 
time by about 30 percent in the present implementation, with very 
small degradation of power. 

The number of passes required before a phase 1 optimal partition is 
achieved is small. On all matrix sizes tested at the time of writing (up 
to 360 points), it has been almost always from 2 to 4 passes. On the basis 
of this experimental evidence, the number of passes is not strongly 
dependent on the value of n. 

From the foregoing observations, it is possible to estimate the total 
running time of the procedure. If we use a method which sorts the D 
values at each stage (time proportional to n” log n), then the running 
time should grow as n° log n. If a fast-sean method is used, and the 
number of passes is constant, the running time should have an n’ 
growth rate; this is a lower bound. 

For comparison, examination of all pairs of sets X and Y, and evalua- 
tion of the costs would require time proportional to 


n/2 2 2 én 2 
2 n nr n 
HX) ~Fz() 
_ (en 
~ 2\n 
5f() 
ee 2 mM 


for large n. This function grows as n*”"4". 


PARTITIONING GRAPHS 301 


Running times have been plotted in Fig. 2. The observed times have 
an apparent growth rate of about n***, which is reasonably close to 
n*. Although on the logarithmic plot this curve is close to linear over 
the range n = 20 to n = 180, it may actually be n° log n; insufficient 
data is available to check this. All times are based on an implementation 
in FORTRAN G on an IBM System 360 Model 65. 


2.5 Improving the Phase 1 Optimal Partition 


In this section, we discuss a method which might be used to improve 
the partition produced by the phase 1 procedure, which may not be 
globally optimum. The method suggested in this section is based heavily 
on experimental evidence, although there are quite plausible reasons 
for performing the particular set of operations. The basic idea is to 
perturb the locally optimal solution in what we hope is an enlightened 
manner, so that an iteration of the process on the perturbed solution 
will yield a further reduction in the total cost. If this tactic fails, nothing 
has been lost except some computation time, since the best solution 
seen so far is always saved. 

Computer results for problems with up to 64 points suggest that 
whenever a phase 1 optimal solution is not globally optimal, |X| = 
| Y | = n/2. Roughly, this implies that if | X | and | Y | had been small 


RUNNING TIME IN SECONDS 





MATRIX SIZE 


Fig. 2— Running time. 
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compared to n/2, they would have been found by the process; it is only 
larger sets which are not identified all the time. 

A successful heuristic to find the correct X and Y in this case is to 
find a phase 1 optimal partition for each of the sets A and B, say A — 
[A,, A2] and B — [B,, B,]. (That is, find near-optimal partitions of A 
and of B separately.) Recombine the 4 sets into 2, say Ay = A, U B, 
and B, = A. U B,, and continue with phase 1 optimization. If our 
expectation is correct, the new X and Y will be small, and thus readily 
identified by the phase 1 process. 

When A is split into A, , A, and B into B, , B, there are two ways in 
which the smaller sets can be recombined. A series of tests was made on 
matrices of moderate size (up to 64 * 64), in which both possible recom- 
binations were done, generating three phase 1 optimal values for each 
starting partition. For matrices of size 32 X 32, the apparent optimal 
value was observed at least once in each triple of values, for a large 
number of cases. With matrices of size 64 X 64, there were occasional 
failures. 

It might be noted that the extra time involved for the recombination 
approach is three times that required to do a completely new partition 
from a random start, assuming an n’-procedure. 

It is possible to estimate whether a particular improvement tactic 
is profitable or not in the following way. Suppose that some method 
increases the probability of finding an optimal partition from p to p’, 
while it increases the running time from ¢ to ¢’. Then in a fixed amount 
of time, it is possible to do k trials of the basic procedure, and kt/t’ 
trials of the improved method. The corresponding probabilities of 
achieving an optimal solution are 1 — (1 — p)* and 1 — (1 — p’)**”* 
respectively. The improved method is then desirable if the second ex- 
pression is greater than the first; by simple manipulation, this condition 
becomes 


l—p’<(1— p)'”. 


On the basis of the numerical values in this section, it may be useful to 
try the recombination method. 


2.6 Partitioning into Unequal-Sized Sets 


It is simple to modify the procedure to partition a set S with n ele- 
ments into two sets of specified sizes n, and n2(n, + n. = n). Assume 
NM, < nz. Then restrict the maximum number of pairs that can be ex- 
changed in one pass of the procedure to n,. All other operations are 
performed on all elements of each set. (The starting partition is into 
two sets, of n, and 2 elements respectively.) 
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Suppose we wish to partition S into two sets, such that there are at 
least n, elements and at most n_. elements in each subset; n; + ne = n, 
but they are not specified further. 

The procedure is easily modified to handle this sort of constraint 
by the addition of ‘“dummy”’ elements. These are elements which have 
no connections whatsoever; that is, they have zero entries in the cost 
matrix wherever they appear. Add 2n. — n dummies so S has 2n,z 
elements, and perform the procedure on it. The resulting partition will 
assign the dummy elements to the two subsets so as to minimize the 
external cost; at this point the dummies are discarded, leaving a parti- 
tion into two subsets that satisfy the size constraints given. 


2.7 Elements of Unequal Sizes 


We have made the assumption so far that the elements (vertices) of 
the graph are all of the same size. This requirement may be relaxed to 
a large extent by converting any node of size k > 1 to a cluster of k nodes 
of size 1, bound together by edges of appropriately high cost. The size 
of the problem will obviously increase proportionally to the value of k, 
so it may be necessary to sacrifice some accuracy to keep the number of 
generated nodes within reasonable bounds. 


IT. MULTIPLE-WAY PARTITIONS 


3.1 Reduction to 2-Way Partitioning Problem 


So far, the discussion has been concerned exclusively with the basic 
problem of performing a 2-way partition on a set of 2n objects. In this 
section we extend the technique to perform k-way partitions on a set of 
kn objects, using the 2-way procedure as a tool. 

The essential idea is to start with some partition into k sets of size 
n and by repeated application of the 2-way partitioning procedure to 
pairs of subsets, make the partition as close as possible to being pairwise 
optimal. (Section 3.2 treats the question of what starting sets to use.) 
Of course pairwise optimality is only a necessary condition for global 
optimality. There may be situations where some complex interchange 
of three or more items from three or more subsets is required to reduce 
a& pairwise optimal solution to globally optimum; at the moment, no 
reasonable method for identifying such sets is known. 

There are (5) pairs of subsets to consider, so the time for one pass 
through all pairs is (assuming an n-procedure) ()n? ~& (kn)’?/2 = 
(number of points)’/2. In general, more passes than this will actually 
be required, since when two sets are made optimal, this may change their 
optimality with respect to other sets. 
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Experience indicates that the number of passes 1s small and the process 
converges quickly. For example, our algorithm selects (2, 7) as the next 
pair of sets to be optimized, where either 7 or 7 has been changed since 
the last time the pair (2, 7) was selected. Using this selection process, 
the average number of passes through each pair of sets is a slowly grow- 
ing function of both k and n. For matrices of size 100 or less and k < 6, 
the number of passes has been less than 5. [The average number of 
passes is computed as the average number of pairs considered to reach 
pairwise optimality, normalized by (5).] 

In any particular trial, there is a correlation between the number of 
pairs selected and the quality of the final partition. To get a better 
solution requires more work. 

Convergence is rapid: two passes account for more than 95 percent 
of the improvement in most cases; the remaining passes contribute only 
small further reductions. Let p(n, k) be the proportion of minimum cost 
solutions found for a particular 7 and k. For k fixed and small compared 
to n, the functional behavior of p(n, k) is similar to the case k = 2, but 
the actual values are lower. Roughly, we observe p(n, k + 1) } p(n, k) 
for k in the range 2-4, and n up to 100, with considerable variation 
depending on the matrix being tested. For instance, for matrices of 
size about 40, p(40, 2) = 0.4, p(42, 3) = 0.2, and p(40, 4) + 0.1. 

Another interesting question is measurement of how close to optimum 
the partitions found are. The solutions obtained by pairwise optimiza- 
tion have values concentrated in a narrow range. In almost all cases, the 
largest value found by the procedure is within 4-5 percent of the smallest. 
As another measure, if c is the mean cost of random partitions and b is 
the cost of the best partition observed, then virtually all partitions 
found have values v such that 


v—b SX 0.1(c — b). 


For instance, one test case was a series of 4-way partitions of a 0-1 
matrix of size 80. This matrix had 1278 nonzero entries (a density of 
0.2), corresponding to 639 edges in the graph. The mean value of 
randomly chosen partitions was 480.6. Twenty-four partitions of this 
matrix were found using the method described above. The lowest 
value encountered was 352 (1 time), the highest 365 (1 time); the mean 
value was 359.5, the median 360. 


3.2 Starting Partition 


In this subsection we discuss various methods of generating good 
starting partitions, based on modifications of the basic procedure. 
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The primary reason for choosing good starting partitions is that this 
particular form of preprocessing reduces the amount of work required 
to make the system pairwise optimal. It may also make the probability 
of an optimal solution higher, although this tendency is very difficult to 
evaluate. 

Several methods for finding good multi-way starting partitions which 
are based on repeated application of the procedure itself have been 
investigated. The essential idea is to generate a k-way starting partition 
by first forming an r-way partition, then an s-way partition on each of 
the resulting subsets, and so on, up to t-way. (Here k = rs --- t#.) The 
partitions found this way will in general be better than those which are 
completely arbitrary. A pairwise optimization stage is applied to the 
final set of subsets. 

For example, if k is a power of 2, then perform a 2-way split, then a 
2-way split on each of these subsets, and so on until the desired size of 
subsets is found. 

This general approach is prone to the following difficulty: the first 
split divides the original set into 7 subsets by trying to make the internal 
connections in each subset as large as possible. Obviously this may con- 
flict directly with the next stage, which is to try to divide each subset 
further. Carried to several levels, it can lead to a relatively poor overall 
solution. In experiments with 4-way partitions of matrices of sizes up to 
64 X 64, this method yields optimal solutions approximately as often 
as does starting with a 4-way partition in the first place. In addition, 
this method will be effective if the matrix happens to have natural 
clusters of approximately the correct size (that is, equal to the final 
subset size). 

A second method which can be used is to partition the set of kn ele- 
ments into a set of n and a set of (k — 1)n, using the slightly modified 
version of the basic procedure discussed in the first part of Section 2.6. 
The set of n elements is set aside, and the next n elements from the 
remaining (k — 1)n are identified. This continues until k subsets have 
been formed; again the pairwise optimization technique is used to 
improve on this partition. 

This method can make an error in the identification of the first set 
which will bias the choice of the second, and so on; the effect is most 
severe for the case where k is large, so each set is small. 

The method of breaking off subsets sequentially has another potential 
flaw: regardless of the starting configuration, it will identify approxi- 
mately the same set each time it is used on a particular problem, and 
hence little is gained by using it twice on one cost matrix. However 
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variations in the order of performing pairwise optimizations can still 
produce different final partitions in general. 

Limited computational experience with sequential break-off followed 
by pairwise optimization suggests that it yields solutions which are on 
the average at least as good as (and sometimes slightly better than) 
those provided by pairwise optimization applied to an arbitrary k-way 
starting partition. Pairwise optimization yields the optimum with a 
higher probability, however, because it is less susceptible to error caused 
by a bad choice made early. For instance, in tests on the 80 point matrix 
mentioned previously, sequential break off yielded 4-way solutions with 
a mean value of 358.6, but the lowest value found was 355. (The highest 
was 363.) These may be compared to 359.5, 352 and 365 for the standard 
partitioning method. 

Running time for the sequential break-off method is lower than for 
straight pairwise optimization. 

Insufficient data is available for a direct comparison between se- 
quential break off and the method of repeated subdivision. 

In all cases, the original process, be it a completely random generation 
of some initial configuration, or the production of a good starting parti- 
tion, is followed by a pairwise optimizing phase. It is unlikely that using 
better starting partitions will lead to worse results than random starts, 
on the average. Whether the possible improvement in results and running 
times will justify the extra computational effort required to generate the 
starting partition depends on the characteristics of the particular class 
of matrices being studied. 

Some limited experiments were performed to compare the present 
procedure with a multi-dimensional scaling technique’, on a Boolean 
matrix of 316 points, with about 1400 nonzero entries. The results in- 
dicated that the procedure identifies clusters well, even when no attempt 
is made to provide a good starting partition. 


3.3 Expansion Factor 


The introduction of dummy elements was mentioned in Section 2.6 
as a method of handling partitioning into subsets of unequal sizes. This 
can be viewed equally well as a means of introducing ‘‘slack’’ into a 
solution, in an attempt to get a lower overall cost by allowing ‘‘ex- 
pansion.” That is, so far we have treated the problem of finding a parti- 
tion with a constraint on the sizes of the subsets, and on the number of 
subsets, since given kn points, we have tried to find the best partitions 
into exactly k subsets of » points each. Suppose we now relax this 
second constraint by permitting the addition of dummy elements to in- 
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Fig. 3 — Cost reduction by expansion. 


crease the size of the problem, and attempt to find the best solution in- 
volving any number (greater than or equal to k) of subsets, with at most 
n points in each. This solution with k or greater subsets will in general 
have a lower cost than the constrained solution. 

Figure 3 shows an example in which introducing slack permits a 
lower overall cost. Assume 7 is 3 and all nodes are size 1. The vertical 
edges have cost 1 and the horizontal ones cost 2. Any partition into 2 
equal subsets has a cost of at least 3, but there is an obvious partition 
into 3 subsets with cost 2. Any nontrivial partition into 4 or more sub- 
sets has a cost greater than 2, so 3 subsets represents the optimal ex- 
pansion. It is possible to find the minimal cost solution and the cor- 
responding optimal amount of expansion as follows. Suppose the problem 
has kn points to be partitioned into k sets of n points each. Starting with 
no slack (kn points), the optimal assignment is found. Then 7 dummies, 
enough to create one extra subset, are added, making a (k + 1)n prob- 
lem, and so on. Eventually, one subset is produced which consists 
entirely of dummies. When this occurs, we take the partition with this 
set of dummies removed as our optimum solution. 
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Laser Speckle Pattern— 
A Narrowband Noise Model 


By CHRISTOPH B. BURCKHARDT 
(Manuscript received September 29, 1969) 


We represent by an electrical model the imaging of a one-dimensional 
coherently illuminated and diffusely reflecting surface by an optical system 
with a rectangular aperture. We then obtain the statistical properties of 
the wmage intensity from the statistical properties of the square of the en- 
velope of a narrowband noise signal in the electrical model. The analysis 
is simple because use can be made of resulis known in communication 
theory. The results agree with those obtained in a direct way. 


I. INTRODUCTION 


The speckle pattern in the image of a coherently illuminated and 
diffusely reflecting object has been analyzed by Enloe’. Enloe’s results 
show a remarkable similarity to some results occurring in the theory of 
narrowband noise (See Ref. 2, pp. 397 ff.). This similarity prompts the 
question whether Enloe’s results can be derived by the use of an elec- 
trical model analogy involving narrowband noise.* In this paper we will 
show that this is indeed possible for the special case of a one-dimensional 
subject and an optical system with a rectangular aperture. The analysis 
is simple because we can use results known in communication theory 
and the results agree with those of Enloe’. 


Il. THE OPTICAL MODEL 


The optical model is shown in Fig. 1. In plane P, there is a coherently 
illuminated row of scatterers which scatter with random phase (a one- 
dimensional diffusely reflecting surface). At a distance d in plane Ps 
there is a lens. In front of the lens there is a rectangular aperture with 
an amplitude transmission H (x, , y2). The distance d is large compared to 
the focal length f of the lens and therefore to a good degree of approxima- 


* Such an analogy was already suggested by Rigden and Gordon.3 
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Fig. 1 — Optical model. 


tion the image of the scatterers in plane P, is situated in plane P; at a 
distance f from the lens. Enloe computed the intensity as well as the 
autocorrelation of the intensity in the image plane.’ 

As a help in understanding the following electrical model we make 
the following comments regarding the optical model. Suppose that the 
instantaneous value of the electric field e(z, , ¢) in the object plane P, 
is given by 


e(x, , t) = a( =] exp (—2zrjrol). (1) 
e(a, , t) is zero for y, + 0. The time-independent phasor is 
a( **) = ala), (2) 
where we have introduced the spatial frequency coordinate 
x 
a= vi (3) 


(We write a( ) as a function of the spatial frequency x,/dd and not of 
x, in order to simplify the following computation.) For later use we also 
introduce 


Le Gi 
Since the lens is situated in the far field of the object, the phasor A (22 , Ye) 
in plane P, in front of the aperture is the Fourier transform of a(a:), 


A(z , Y2) = . f- A(a,) exp (2rja,%2) exp (277B,Y2) doy dB, . (5) 
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Since a(a;) is zero for 8; ¥ 0 the above two-dimensional Fourier trans- 
form operation reduces to the one-dimensional Fourier transform 
operation 


a 


A(te , Ye) = / . a(a,) exp (2rjo,X,) da, . (6) 


It is seen that A(xz, ye) is a function of x, only and we will therefore 
write A(x.). The phasor behind the aperture, B(x. , yz) is obtained as 
the product of A(x.) and the aperture transmission function H (2.2, y2) 


B(x2, Y2) = A(X2)*H (xe , Ye). (7) 


As was mentioned we assume a rectangular aperture function. We 
assume that H(2x2, ye) can be written as 


(x2 , Y2) = H,(x2)-H2(y2). (8) 


Since our object is one-dimensional we are only interested in the image 
intensity at ys = 0. The image intensity at yz = 0 is not influenced by 
H.(y2) except for a factor which remains constant over all 73. For 
equation (7) we can therefore write 


B(x2) = A(x2)-H, (x2) (9) 


with the understanding that B(x.) does vary in the y- direction but that 
this variation is of no interest to us. So much for the optical model. 
We will now present the electrical model. 


III. THE ELECTRICAL MODEL 


The electrical model is shown in Fig. 2. The electric field in the object 
plane is scanned by a detector whose output voltage is proportional 
to the instantaneous value of the electric field in the object plane. (In 
the optical region there are no such detectors available. This should 
not present any conceptual difficulties since we can scale up our model to 
a longer wavelength where there are such detectors.) 


P, Ps 











FILTER DISPLAY OF 


ENVELOPE 
Vo (t) / 





N 
“DETECTOR 


Fig. 2— Electrical narrowband noise model. 
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The electric field e(z, , ¢) in plane P, is given by equation (1) and the 
output voltage v;(¢) of the scanning detector is 


v,(f) = c,a( 24) exp (—2rjol). (10) 


v; 1s the scanning velocity and we have 
Ly = Vil. (11) 


c, denotes a constant of proportionality which is of no interest. The 
Fourier transform of v;(t) of equation (10), F7[v;(t)] is given by 


FT(v.(t)] = ¢ ae al (ye ¥9 | 


V1 


A(v) is the Fourier transform of a(t). c. is a constant of proportionality. 
The Fourier transform or spectrum of v;(¢) is centered at v>) as shown in 
Fig. 3. In the optical model the spectrum A(x.) is multiplied by the 
aperture transmission function H(z.) in plane P.. See equation (9). 
In order to simulate this in our electrical model we have to pass the 
voltage v;(t) through a temporal frequency filter with the frequency 
response H,[(Ad/v,)(v — v)]. The spectrum B[(\d/v,)(v — vo)] at the 
output of the filter is then given by 


[te-n]~afo—w]afo—n] os 


The filter H,[(Ad/v,)(v — v9)] is shown in the electrical model of Fig. 2 
and its frequency response is sketched in Fig. 3. (The frequency response 
H,[(\d/v,) (v — vo)] in the electrical model corresponding to the rectangu- 
lar aperture transmission function of the optical system can only be 


| 


(12) 


SPECTRUM AND 
FREQUENCY 
RESPONSE OF FILTER 





Vo Vv 


Fig. 3— Spectrum of the electrical signal and frequency response of the filter. 
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realized with a time lag, but this need not disturb us.) At the output of 
the filter we have the voltage v,(t) 


V,t 


V(t) = csb( 4) exp (—2rjrol), (14) 


where b(é) is the Fourier transform of B(y). c; again is a constant of 
proportionality of no interest. The output device scans the image plane 
P, with a velocity v. and 


v 


Vy 


(15) 


— 
— 


Qi 


because in the optical model the image is demagnified by a factor d/f 
with respect to the object. The inversion of the optical image with re- 
spect to the object is accounted for by inverting the coordinate system 
in the image plane P; , see Fig. 1. We can now write for equation (14) 


v(t) = c3b Yel exp (—2rjvot) 
Cs) i” 


| 


cb(2) exp (—2rrjvol). 
In the optical model we detect the intensity in the image plane, that is, 
the square of the absolute value of the phasor. Therefore in our electrical 
model the output device displays | b(23/Af) |”. 

The voltage v,(¢) at the output of the filter is narrowband compared 
to v). We now consider v,(¢) as a narrowband noise voltage. Another 
equivalent representation is (see, for example, Ref. 2, p. 397) 


v(t) = N,(t) cos 2rvot + N,(t) sin 2rvol, (17) 


nso = nfo) 
sso = m [0] 


Re [ ] means “real part of’? and Im [ ] means ‘imaginary part of’. 
We now assume that the filter is so narrowband that its impulse response 
is much wider than the time over which »,(t) shows any appreciable 
correlation. (This assumption is the equivalent of Enloe’s assumption 
that the optical spread function is much wider than the average distance 
between the images of scatterers with independent phase.) If this 


where 


(18) 
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assumption holds, the voltage v,(t) results from many independent 
values of v,(¢). According to the central limit theorem v,(¢) and therefore 
N.(t) and N,(t) show a Gaussian distribution. The voltage v,(¢) therefore 
has the statistical properties of narrowband Gaussian noise which are 
discussed for example in Ref. 2, pp. 397 ff. As was mentioned, the square 
of the envelope corresponds to the optical intensity. The envelope £(é) 
is obtained as 


E(t) = (Nit) + NiO}? 


2 2\ 3 (19) 
Vol Vol - 
= (pre Lo(55) IF + om LGD) HF) 

According to Ref. 2, equation (9.18), H(t) has a Rayleigh probability 

distribution density W (£) 
1 fee Fh 
W(L) = Sepia ey (20) 
The average value of the square of the envelope (Z”),, which is equal to 


the average value of the optical intensity (J),, is given by [Ref. 2, Eq. 
(9.21a)] 


(Bae aad (L) av z 2y. (21) 
We are now also interested in the autocorrelation R(s) of the intensity J 
ROG) = (L(t) T(ts + 8) a} 


= (E'(as)E"(as + 8))av - 


According to Ref. 2, equation (9.24), R(s) is given by the following 
expression 


R(s) = 4y°[1 + ko()] 
= (1°).[1 + ko(s)], 


where we have used equation (21). As explained in Ref. 2, [equations 
(9.10b and 9.12b)] k(s) is the autocorrelation of the spread function 
corresponding to a filter with the frequency response H,((Ad/v,)y). 
This is the frequency response of the filter in our electrical model (see 
Fig. 2), but centered at zero frequency. The frequency response H,((Ad/v,)v) 
is shown in Fig. 4. k.(s) is normalized to one at s = 0. (The above dis- 
cussion holds when the frequency response of the filter is symmetrical 
about the origin as is true for our case.) The spread function is the 


(23) 
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Fig. 4— Frequency response of the filter H1[(\d/v1) v]. 
Fourier transform of the frequency response and so we have 


o| a ( | = en 2 2 (24) 


where H,(v) and h(é) are Fourier transform pairs. c, is a constant of no 
interest. Therefore we obtain for k,(s) 


hoes 7 hi ene: a ur) di 








p(0) Af/ \Mf oe 
VoT § 
_ oe _ A) 
p(0) p(0) ’ 
where 
a } h* (h(t + u) db. (26) 


Using equations (23) and (25) we finally obtain for R(s) 


“() 


K(s) = Al 0) |" (27) 


This last equation is the same as Enloe’s equation (14) if the last term 
in that equation is disregarded. The last term in Enloe’s equation (14) 
vanishes if the spread function of the filter is much broader than the 
average distance between images of scatterers with independent phase. 
The power spectrum follows from the autocorrelation function in the 
same way as in Iinloe’s analysis and this derivation will not be repeated 
here. 

In summary: We have derived the statistical properties of the intensity 
in the image of a one-dimensional coherently illuminated and diffusely 
reflecting subject with the help of a narrowband noise model. Use was 
made of results known in communications theory. 





316 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


REFERENCES 


1. Enloe, L. H., ‘“Noise-like Structure in the Image of Diffusely Reflecting Objects 
in Coherent Illumination,” BS.TJ., 46, No. 7 (September 1967), pp. 1479- 
1489. 

2. Middletown, D., An Introduction to Statistical Communication Theory, New 
York: McGraw-Hill Book Co., 1960. 

3. Rigden, J. D., and Gordon, E. I., “The Granularity of Scattered Optical Mascr 
Light,” Proc. I.R.E., 60, No. 11 (November 1962), pp. 2367-2368. 


Contributors to This Issue 


CuristorpH B. BurckHarpt, Dipl.-Ing., 1959, Dr. sc. techn., 1968, 
Swiss Federal Institute of Technology; Bell Telephone Laboratories, 
1963—. Initially Mr. Burckhardt was engaged in the analysis of varactor 
frequency multipliers. Since 1965, he has been working in holography. 
Member, IEEE, Optical Society of America. 


SHLoMO HauFin, M.Sce., 1958, and Ph.D., 1962, The Hebrew Univer- 
sity of Jerusalem (Israel); Bell Telephone Laboratories, 1968—. Mr. 
Halfin is working on theoretical problems in the area of mathematical 
programming and on the development of special purpose optimization 
algorithms. Member, American Mathematical Society, Operations Re- 
search Society of America, and the Society for Industrial and Applied 
Mathematics. 


LELAND B. Jackson, S.B. and 8.M., 1963, Massachusetts Institute 
of Technology; Bell Telephone Laboratories, 1961-62, 1966—. Mr. 
Jackson was first associated with Bell Laboratories under the M.I.T. 
cooperative program in electrical engineering. Since 1966 he has been 
primarily concerned with the analysis and synthesis of digital filters 
and related systems. He has completed the requirements for the Sc.D. 
degree from Stevens Institute of Technology. Member, IEEE, Tau 
Beta Pi, Eta Kappa Nu, Sigma Xi. 


Norman D. Kenyon, B.A., 1963, M.A., 1966, and Ph.D., 1967, 
Cambridge University, England; Bell Telephone Laboratories, 1968—. 
Mr. Kenyon is currently working on circuit structures for millimeter- 
wave IMPATT diodes. Member, IEEE. 


Brian W. KernicHan, B.A.Sc., 1964, University of Toronto; Ph.D., 
1969, Princeton University; Bell Telephone Laboratories, 1969—. Mr. 
Kernighan is working primarily on graph-partitioning and its applica- 
tions and extensions. Member, Association for Computing Machinery. 


Ronautp W. Korpos, B.E.E., 1957, University of Detroit; M.S.E.E., 
1959, Northeastern University; Bell Telephone Laboratories, 1957—. 
Mr. Kordos has engaged in the design of microwave ferrite devices, 


317 


318 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1970 


including field-displacement and resonance isolators for several micro- 
wave radio relay systems, and a passive power limiter for the Telstar® 
Satellite. In 1964, he transferred to the Columbus Laboratory and was 
engaged in the development of broadband switching matrices and in a 
study of microstrip interconnection in high-speed integrated circuits. 
Since 1968, he has been a member of the Radio Transmission Labora- 
tory at Merrimack Valley involved in the development of microwave 
integrated circuits. Member, Tau Beta Pi, Eta Kappa Nu. 


R. R. Laans, B.S.E.E., 1962, University of Illinois; M.8.E.E., 1964, 
New York University; Bell Telephone Laboratories, 1962—. Mr. Laane 
has worked on the application of super-conductive switches to data 
processing systems and has investigated the application of optical 
processing techniques to data processing systems. Since 1967, he has 
been engaged in exploratory work on the application of semiconductor 
devices to telephone switching networks and on analog-to-digital 
conversion techniques. He is presently also working on solid-state 
Picturephone’”’ switching networks. Member, IEEE. 


SHEN Lin, B.S. (summa cum laude), 1951, University of the Philip- 
pines; M.A., 19538, Ph.D., 1968, Ohio State University; Bell Telephone 
Laboratories, 1968—. He has worked in the field of Turing machine 
theory, combinatorial analysis and number theory. At present, he is 
working on applications of computers in various optimization and 
number-theoretic problems. Member, AAAS, American Mathematical 
Society, Mathematical Association of America, SIAM, Phi Kappa Phi, 
Sigma Pi Sigma, Pi Mu Epsilon. 


Dietrich Marcust, Diplom Vorpruefung, 1952, Dipl. Phys., 1954, 
Berlin Free University; D.E.E., 1962, Technische Hochschule, Karl- 
sruhe, Germany; Siemens and Halske (Germany), 1954-57; Bell Tele- 
phone Laboratories, 1957—. At Siemens and Halske, Mr. Marcuse 
was engaged in transmission research, studying coaxial cable and circu- 
lar waveguide transmission. At Bell Telephone Laboratories, he has 
been engaged in studies of circular electric waveguides and work on 
gaseous masers. He spent one year (1966-1967) on leave of absence 
from Bell Telephone Laboratories at the University of Utah where he 
wrote a book on quantum electronics. He is presently working on the 
transmission aspect of a light communications system. Member, IEEE, 
Optical Society of America. 


CONTRIBUTORS TO THIS ISSUE 319 


Merton B. Purvis, B.S., 1944, M.S., 1949, Iowa State College; Ph.D., 
1954, Pennsylvania State University; Bell Telephone Laboratories, 
1955—. Mr. Purvis worked initially on development of network cool- 
ing, optics and photographie aspects of a large computer memory for 
the Morris Electronic Switching System. He subsequently engaged 
in research on the magnetic characteristics of ferreed switches used in 
electronic switching, development work on switches for the 60-90 MHz 
band and exploratory work on fluidic networks. He currently heads the 
Apparatus Design Department dealing with relays, crossbar switches 
and miscellaneous apparatus. Member, American Society of Mechan- 
ical Engineers, Sigma Xi, Pi Tau Sigma, American Society for the 
Advancement of Science. 


Davip C. Rirs, B.S.E.E., 1960, University of Washington; M.E.E., 
1962, New York University; Bell Telephone Laboratories 1960—. Mr. 
Rife has worked on data carrier systems, automatic calling units and 
data test equipment. Since 1963 he has been supervising the develop- 
ment of data station testing systems. He 1s working on his Ph.D. at the 
Polytechnic Institute of Brooklyn. Member, IEEE, Phi Beta Kappa, 
Tau Beta Pi. 


GrorGE A. VINCENT, B.S.E.E., 1966, Newark College of Engineering; 
M. E. (Electrical), 1968, Stevens Institute of Technology; Bell Tele- 
phone Laboratories, 1966—. Mr. Vincent is engaged in the development 
of data test equipment. He is presently involved in the development of 
a computer controlled Automatic Data Test Center. Member, IEEE, 
Tau Beta Pi, Eta Kappa Nu. 


Erratum 


On page 3440 of the December 1969 Bell System Technical Journal, 
Reference 6, “Comparison of An Energy Density Antenna System with 
Predetection Combining Systems for Mobile Radio” by W. C. Y. Lee, 
was mistakenly listed as an unpublished work. That article was pub- 
lished in the JEHE Trans. on Communication Technology, 17, No. 2 
(April 1969), pp. 277-284. 
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